6194 results about "Cluster algorithm" patented technology

The Microsoft Clustering algorithm provides two methods for creating clusters and assigning data points to the clusters. The first, the K-means algorithm, is a hard clustering method. This means that a data point can belong to only one cluster, and that a single probability is calculated for the membership of each data point in that cluster.
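
As a rough illustration of hard clustering, the sketch below implements plain k-means (Lloyd's algorithm) in Python. It is not Microsoft's implementation; the function name `kmeans`, its parameters, and the sample points are assumptions made for this example.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: every point ends up with exactly one cluster label."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as starting centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its single nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical sample data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
```

Each entry in `labels` is a single cluster index, which is what makes the assignment "hard": no point is shared between clusters.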

Systems and methods for investigation of financial reporting information

Financial data including general ledger activity and underlying journal entries are examined to determine whether risks of material misstatement due to fraudulent financial reporting can be identified. The financial data is analyzed statistically and modeled over time, comparing actual data values with predicted data values to identify anomalies in the financial data. The anomalous financial data is then analyzed using clustering algorithms to identify common characteristics of the various transactions underlying the anomalies. The common characteristics are then compared with characteristics derived from data known to derive from fraudulent activity, and the common characteristics are reported, along with a weight or probability that the anomaly associated with the common characteristic is an identification of risks of material misstatement due to fraud. Large volumes of financial data are therefore efficiently processed to accurately identify risks of material misstatement due to fraud in connection with financial audits, or for actual detection of fraud in connection with forensic and investigative accounting activities. The analysis is enhanced by using flow analysis methods to select subsets of financial data to examine for anomalies. Flow analysis methods are also used to reveal useful business information found in money flow graphs of financial data.
Owner:PRICEWATERHOUSECOOPERS LLP

System and method for detection of domain-flux botnets and the like

In one embodiment, a method is provided for detecting malicious software agents, such as domain-flux botnets. The method applies a co-clustering algorithm on a domain-name query failure graph, to generate a hierarchical grouping of hosts based on similarities between domain names queried by those hosts, and divides that hierarchical structure into candidate clusters based on percentages of failed queries having at least first- and second-level domain names in common, thereby identifying hosts having correlated queries as possibly being infected with malicious software agents. A linking algorithm is used to correlate the co-clustering results generated at different time periods to differentiate actual domain-flux bots from other domain-name failure anomalies by identifying candidate clusters that persist for relatively long periods of time. Persistent candidate clusters are analyzed to identify which clusters have malicious software agents, based on a freshness metric that characterizes whether the candidate clusters continually generate failed queries having new domain names.
Owner:RPX CORP

Apparatus and method for discovering context groups and document categories by mining usage logs

An apparatus is provided for relating user queries and documents. The apparatus includes a client, a server, and a database being mutually coupled to a communications pathway. The client is configured to enable a user to submit user queries to locate documents. The server has a data mining mechanism configured to receive the user queries and generate information retrieval sessions. The database stores data in the form of usage logs generated from the information retrieval sessions. The data mining mechanism includes a clustering algorithm operative to identify context groups and usage categories. The data mining mechanism is operative to identify query contexts associated with individual queries from the usage logs, partition the queries into context groups having similar contexts, and compute multiple context groups associated with specific query keywords from the usage logs. A method is provided for associating user queries and documents in accordance with the apparatus.
Owner:GOOGLE LLC

System and methods for vital sign estimation from passive thermal video

A system for measuring a pulse and respiratory rate from passive thermal video includes contour segmentation and tracking, clustering of informative pixels of interest, and robust dominant frequency component estimation. Contour segmentation is used to locate a blood vessel region to measure, after which all pixels in the nearby region are aligned across frames based on the segmentation's position and scale in each frame. Spatial filtering is then performed to remove noise not related to the heartbeat, and then non-linear filtering is performed on the temporal signal corresponding to each aligned pixel. The signal spectrum of each pixel is then fed to a clustering algorithm for outlier removal. Pixels in the largest cluster are then used to vote for the dominant frequency, and the median of the dominant frequency is output as the pulse rate.
Owner:FUJIFILM BUSINESS INNOVATION CORP
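
The final voting step of the abstract above can be sketched as follows. This is a simplified illustration, not the patented method: it skips segmentation, filtering, and the clustering-based outlier removal, and relies on the median alone to tolerate a minority of outlier pixels. The function names, the naive DFT, and the synthetic signals are all assumptions made for this example.

```python
import cmath
import math
import statistics


def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC component, via a naive DFT."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]  # remove the DC offset
    best_bin, best_power = 1, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sample_rate / n


def pulse_rate_hz(pixel_signals, sample_rate):
    """Each pixel votes with its dominant frequency; the median vote is the output."""
    votes = [dominant_frequency(s, sample_rate) for s in pixel_signals]
    return statistics.median(votes)


def wave(freq, phase, fps, n):
    """Synthetic per-pixel temporal signal (hypothetical test data)."""
    return [math.sin(2 * math.pi * freq * t / fps + phase) for t in range(n)]


# Four pixels pulsing at 1.2 Hz and one outlier at 3.0 Hz,
# sampled at 30 frames per second for 5 seconds.
fps, n = 30, 150
signals = [wave(1.2, p, fps, n) for p in (0.0, 0.5, 1.0, 1.5)] + [wave(3.0, 0.0, fps, n)]
rate = pulse_rate_hz(signals, fps)
```

Multiplying `rate` by 60 converts the result from hertz to beats per minute.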

Method and apparatus for automatic file clustering into a data-driven, user-specific taxonomy

An automatic file clustering algorithm enables documents within a file system to be displayed in a semantic view. The file clustering algorithm maps all words and documents into an appropriate semantic vector space, clusters the documents at a predetermined level of granularity, and assigns a meaningful descriptor to each resulting cluster. The documents are displayed to the user in a hierarchy in accordance with the resulting clusters. This results in a virtual file system with a semantic organization, that allows the user to navigate by content.
Owner:APPLE INC

Techniques for clustering structurally similar webpages based on page features

Web page clustering techniques described herein are URL Clustering and Page Clustering, whereby clustering algorithms cluster together pages that are structurally similar. Regarding URL clustering, because similarly structured pages have similar patterns in their URLs, grouping similar URL patterns will group structurally similar pages. Embodiments of URL clustering may involve: (a) URL normalization and (b) URL variation computation. Regarding page clustering, page feature-based techniques further cluster any given set of homogenous clusters, reducing the number of clusters based on the underlying page code. Embodiments of page clustering may reduce the number of clusters based on the tag probabilities and the tag sequence, utilizing an Approximate Nearest Neighborhood (ANN) graph along with evaluation of intra-cluster and inter-cluster compactness.
Owner:R2 SOLUTIONS

System and method for interaction between users of an online community

There is disclosed a method of facilitating interaction between users of an electronic community. In an embodiment, the method comprises: reviewing a user activity log for each user in the electronic community; executing a natural language parser to extract significant noun phrases from the user activity log; updating user profiles from the newly extracted noun phrases, based on their usage frequency and importance value; storing the updated profiles in a user profile and relationship database; and executing a similarity-based clustering algorithm to cluster user profiles, thereby discovering relationships among users and storing them in the user profile and relationship database. The method may further comprise displaying for each user the one or more relationships to which the user is assigned, together with a list of users assigned to the one or more relationships. The method may also comprise storing for each user the relationship to which the user is assigned in the user profile and relationship database.
Owner:KYNDRYL INC

Method for updating voiceprint feature model and terminal

The invention is suitable for the technical field of voice recognition and provides a method for updating a voiceprint feature model. The method comprises the following steps of: obtaining an original audio stream of at least one speaker, and obtaining the audio stream of each speaker in the at least one speaker in the original audio stream according to a preset speaker segmentation and clustering algorithm; matching the respective audio stream of each speaker in the at least one speaker with an original voiceprint feature model, so as to obtain the successfully-matched audio stream; and using the successfully-matched audio stream as an additional audio stream training sample used for generating the original voiceprint feature model, and updating the original voiceprint feature model. According to the invention, the valid audio stream during a conversation process is adaptively extracted and used as the additional audio stream training sample, and the additional audio stream training sample is used for dynamically correcting the original voiceprint feature model, thus achieving a purpose of improving the precision of the voiceprint feature model and the accuracy of recognition under the premise of high practicability.
Owner:HUAWEI DEVICE CO LTD

Context adaptive approach in vehicle detection under various visibility conditions

Inactive | US20080069400A1 | Alleviate visibility limitation | Easy to detect | Scene recognition | Vision based | Effect light
Adaptive vision-based vehicle detection methods, taking into account the lighting context of the images, are disclosed. The methods categorize the scenes according to their lighting conditions and switch between specialized classifiers for different scene contexts. Four categories of lighting conditions have been identified using a clustering algorithm in the space of image histograms: Daylight, Low Light, Night, and Saturation. Trained classifiers are used for both Daylight and Low Light categories, and a tail-light detector is used for the Night category. Improved detection performance by using the provided context-adaptive methods is demonstrated. A night time detector is also disclosed.
Owner:CONTINENTAL AUTOMOTIVE GMBH

Cluster-based management of collections of items

Computer-implemented processes are disclosed for clustering items and improving the utility of item recommendations. One process involves applying a clustering algorithm to a user's collection of items. Information about the resulting clusters is then used to select items to use as recommendation sources. Another process involves displaying the clusters of items to the user via a collection management interface that enables the user to attach cluster-level metadata, such as by rating or tagging entire clusters of items. The resulting metadata may be used to improve the recommendations generated by a recommendation engine. Another process involves forming clusters of items in which a user has indicated a lack of interest, and using these clusters to filter the output of a recommendation engine. Yet another process involves applying a clustering algorithm to the output of a recommendation engine to arrange the recommended items into cluster-based categories for presentation to the user.
Owner:AMAZON TECH INC

Systems and/or methods for dynamic anomaly detection in machine sensor data

Certain example embodiments relate to techniques for detecting anomalies in streaming data. More particularly, certain example embodiments use an approach that combines both unsupervised and supervised machine learning techniques to create a shared anomaly detection model in connection with a modified k-means clustering algorithm and advantageously also enables concept drift to be taken into account. The number of clusters k need not be known in advance, and it may vary over time. Models are continually trainable as a result of the dynamic reception of data over an unknown and potentially indefinite time period, and clusters can be built incrementally and in connection with an updatable distance threshold that indicates when a new cluster is to be created. Distance thresholds also are dynamic and adjustable over time.
Owner:SOFTWARE AG USA
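
The idea of building clusters incrementally, with a distance threshold deciding when a new cluster is opened, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's modified k-means: the class name `IncrementalClusters` and its methods are hypothetical, and the threshold is held fixed here even though the abstract describes thresholds that are adjustable over time.

```python
import math


class IncrementalClusters:
    """Streaming clustering where the number of clusters is not fixed in advance."""

    def __init__(self, threshold):
        self.threshold = threshold  # maximum distance for joining an existing cluster
        self.centroids = []         # one running-mean centroid per cluster
        self.counts = []            # number of points absorbed by each cluster

    def update(self, point):
        """Absorb one point; return (cluster_index, opened_new_cluster)."""
        if self.centroids:
            idx = min(
                range(len(self.centroids)),
                key=lambda i: math.dist(point, self.centroids[i]),
            )
            if math.dist(point, self.centroids[idx]) <= self.threshold:
                # Close enough: fold the point into the running mean.
                self.counts[idx] += 1
                n = self.counts[idx]
                self.centroids[idx] = [
                    c + (x - c) / n for c, x in zip(self.centroids[idx], point)
                ]
                return idx, False
        # Too far from every centroid (or no clusters yet): open a new cluster.
        self.centroids.append(list(point))
        self.counts.append(1)
        return len(self.centroids) - 1, True


# Hypothetical stream of 2-D sensor readings.
model = IncrementalClusters(threshold=2.0)
stream = [(0, 0), (1, 0), (0.5, 0.5), (10, 10), (10.5, 10)]
results = [model.update(p) for p in stream]
```

In an anomaly-detection setting, a point that opens a new cluster could be flagged for review; that interpretation is an assumption of this sketch rather than something the abstract states.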

Techniques for clustering structurally similar web pages

Web page clustering techniques described herein are URL Clustering and Page Clustering, whereby clustering algorithms cluster together pages that are structurally similar. Regarding URL clustering, because similarly structured pages have similar patterns in their URLs, grouping similar URL patterns will group structurally similar pages. Embodiments of URL clustering may involve: (a) URL normalization and (b) URL variation computation. Regarding page clustering, page feature-based techniques further cluster any given set of homogenous clusters, reducing the number of clusters based on the underlying page code. Embodiments of page clustering may reduce the number of clusters based on the tag probabilities and the tag sequence, utilizing an Approximate Nearest Neighborhood (ANN) graph along with evaluation of intra-cluster and inter-cluster compactness.
Owner:R2 SOLUTIONS

Method And System For Semantically Segmenting Scenes Of A Video Sequence

A shot-based video content analysis method and system is described for providing automatic recognition of logical story units (LSUs). The method employs vector quantization (VQ) to represent the visual content of a shot, following which a shot clustering algorithm is employed together with automatic determination of merging and splitting events. The method provides an automated way of performing the time-consuming and laborious process of organising and indexing increasingly large video databases such that they can be easily browsed and searched using natural query structures.
Owner:BRITISH TELECOMM PLC

Cluster-based assessment of user interests

Computer-implemented processes are disclosed for clustering items and improving the utility of item recommendations. One process involves applying a clustering algorithm to a user's collection of items. Information about the resulting clusters is then used to select items to use as recommendation sources. Another process involves displaying the clusters of items to the user via a collection management interface that enables the user to attach cluster-level metadata, such as by rating or tagging entire clusters of items. The resulting metadata may be used to improve the recommendations generated by a recommendation engine. Another process involves forming clusters of items in which a user has indicated a lack of interest, and using these clusters to filter the output of a recommendation engine. Yet another process involves applying a clustering algorithm to the output of a recommendation engine to arrange the recommended items into cluster-based categories for presentation to the user.
Owner:AMAZON TECH INC

System and method to optimize control cohorts using clustering algorithms

A computer-implemented method, apparatus, and computer-usable program code are provided for automatically selecting an optimal control cohort. Attributes are selected based on patient data. Treatment cohort records are clustered to form clustered treatment cohorts. Control cohort records are scored to form potential control cohort members. The optimal control cohort is selected by minimizing differences between the potential control cohort members and the clustered treatment cohorts.
Owner:LINKEDIN

Improved positioning method of indoor fingerprint based on clustering neural network

The invention belongs to the technical field of wireless communication and wireless network positioning, and in particular relates to an improved indoor fingerprint positioning method based on a clustering neural network. According to the technical scheme, the positioning method comprises the following steps: an offline phase: constructing a fingerprint database from fingerprint information collected at reference points, sorting fingerprints in the fingerprint database by utilizing a clustering algorithm, and training on the fingerprint and position information of each reference point by utilizing an artificial neural network model to obtain an optimized network model; and an online phase: carrying out cluster matching between the collected real-time fingerprint information and the cluster centers in the fingerprint database to determine a primary positioning area, and taking the real-time fingerprint information in the primary positioning area as the input of the neural network model of the reference point to acquire a final accurate position estimate. The method has the advantages that low calculation and storage cost for the clustering artificial neural network fingerprint positioning method can be guaranteed, the positioning accuracy of the clustering artificial neural network fingerprint positioning method can be improved, and accurate positioning information is provided for users.
Owner:BEIJING JIAOTONG UNIV

Method and system for detecting malicious application

A malicious application detection method is provided. The method includes: extracting a plurality of static features from a manifest file and a de-compiled code respectively obtained from a plurality of training malicious applications (APK files) and a plurality of training benign applications (APK files); generating at least one malicious application group using a clustering algorithm and generating at least one benign application group; generating application detecting models respectively representing the malicious and benign application groups based on static features of the training malicious and benign applications in each malicious application group and each benign application group; extracting target static features from a target manifest file and a target de-compiled code of a target application; using a classification algorithm, the target static features, and the application detecting models to determine whether the target application belongs to the malicious application group; and generating a warning message when a determination result is positive.
Owner:NAT TAIWAN UNIV OF SCI & TECH

Cluster-based categorization and presentation of item recommendations

Computer-implemented processes are disclosed for clustering items and improving the utility of item recommendations. One process involves applying a clustering algorithm to a user's collection of items. Information about the resulting clusters is then used to select items to use as recommendation sources. Another process involves displaying the clusters of items to the user via a collection management interface that enables the user to attach cluster-level metadata, such as by rating or tagging entire clusters of items. The resulting metadata may be used to improve the recommendations generated by a recommendation engine. Another process involves forming clusters of items in which a user has indicated a lack of interest, and using these clusters to filter the output of a recommendation engine. Yet another process involves applying a clustering algorithm to the output of a recommendation engine to arrange the recommended items into cluster-based categories for presentation to the user.
Owner:AMAZON TECH INC

Density clustering-based self-adaptive trajectory prediction method

The invention discloses a density clustering-based self-adaptive trajectory prediction method which comprises a trajectory modeling stage and a trajectory updating stage, wherein in the trajectory modeling stage, rasterizing treatment is carried out on a newly generated movement report, so that moving points can be obtained and are divided into six moving point subsets; the six moving point subsets are clustered by adopting a limited area data sampling-based density clustering algorithm, so that a new trajectory cluster can be formed; the new trajectory cluster and an old trajectory cluster in the same period of time are merged with each other according to the similarity of the trajectory points, and the trajectory points of the merged trajectory cluster and the area of influence are updated; the trajectory points are combined according to the time sequence, so that a complete user movement trajectory can be obtained; in the trajectory updating stage, the user movement trajectory generated in the trajectory modeling stage is corrected. The density clustering-based self-adaptive trajectory prediction method is used for user movement trajectory prediction in the mobile communication scene; furthermore, when the new user movement trajectory is generated, the whole trajectory data is not needed to be modeled again.
Owner:XIAN UNIV OF TECH

Network security situational awareness method

The invention discloses a network security situational awareness method in the technical field of information security, which comprises the steps of: acquiring data from security defect software and/or hardware, preprocessing the data, and using the preprocessed data as data samples; carrying out feature extraction and dimension reduction on the data samples by using manifold learning to obtain output values of the data samples; clustering the output values of the data samples by using a core matching integration clustering algorithm; fusing the clustered results by adopting DS (Dempster-Shafer) evidential reasoning; estimating the network security situation and threat by adopting a hierarchical model; predicting the network security situation over a set future time length by using historical data and the current network security situation; and judging whether the network security is threatened according to a set threshold. According to the invention, the real-time performance and the accuracy of network security situational awareness are enhanced.
Owner:NORTH CHINA ELECTRIC POWER UNIV (BAODING)

Dynamic city zoning for understanding passenger travel demand

A system and method for dynamic zoning are provided. Travel demand data is received for a network which includes a set of points. The travel demand data includes values representing demand from each point to each other point. Destination-distance values are computed which reflect the similarity between points in a respective pair, based on the travel demand data. For each pair of the points, a geo-distance value is generated which reflects the distance between locations of the points in the pair. An aggregated affinity matrix is formed by aggregating the computed geo-distance values and destination-distance values. The aggregated affinity matrix is used by a clustering algorithm to assign each of the points in the set to a respective one of a set of clusters. A representation of the clusters can be generated in which each of a set of zones encompasses the points assigned to its respective cluster.
Owner:XEROX CORP
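
One way to picture the aggregation step in the abstract above is the sketch below. The abstract does not say how the two distance measures are defined or combined, so everything here is an assumption: destination-distance is taken as the Euclidean distance between normalized outgoing-demand profiles, geo-distance as the Euclidean distance between coordinates, and the aggregate as a weighted sum with a hypothetical weight `alpha`.

```python
import math


def normalize(row):
    """Turn raw demand counts into a destination profile that sums to 1."""
    total = sum(row)
    return [v / total for v in row] if total else list(row)


def scale(matrix):
    """Rescale a distance matrix to the range [0, 1]."""
    peak = max(max(row) for row in matrix)
    return [[v / peak if peak else 0.0 for v in row] for row in matrix]


def aggregated_distance(demand, coords, alpha=0.5):
    """Blend destination-distance and geo-distance into one pairwise matrix."""
    n = len(demand)
    profiles = [normalize(row) for row in demand]
    dest = scale([[math.dist(profiles[i], profiles[j]) for j in range(n)] for i in range(n)])
    geo = scale([[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)])
    return [
        [alpha * dest[i][j] + (1 - alpha) * geo[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical data: demand[i][j] is travel demand from point i to point j.
demand = [
    [0, 5, 1, 1],
    [4, 0, 1, 2],
    [1, 1, 0, 6],
    [2, 1, 7, 0],
]
coords = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
matrix = aggregated_distance(demand, coords)
```

The resulting matrix could then be handed to any clustering algorithm that accepts precomputed pairwise distances; which algorithm the patent uses is not stated in the abstract.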

Massive clustering of discrete distributions

The trend of analyzing big data in artificial intelligence requires more scalable machine learning algorithms, among which clustering is a fundamental and arguably the most widely applied method. To extend the applications of regular vector-based clustering algorithms, the Discrete Distribution (D2) clustering algorithm has been developed for clustering bags of weighted vectors which are well adopted in many emerging machine learning applications. The high computational complexity of D2-clustering limits its impact in solving massive learning problems. Here we present a parallel D2-clustering algorithm with substantially improved scalability. We develop a hierarchical structure for parallel computing in order to achieve a balance between the individual-node computation and the integration process of the algorithm. The parallel algorithm achieves significant speed-up with minor accuracy loss.
Owner:PENN STATE RES FOUND

Pedestrian identification method under road traffic environment based on improved YOLOv3

Inactive | CN109325418A | Verify the recognition effect | Solve the problem of difficult and slow target detection | Biometric pattern recognition | Cluster algorithm | Road traffic
The invention discloses a pedestrian identification method under a road traffic environment based on improved YOLOv3. The method comprises the following steps: S1, acquiring and pre-processing an image, and making a pedestrian sample set; S2, calculating the length-width ratio of the pedestrian candidate frames by using a clustering algorithm and the training set; S3, inputting the training set into the YOLOv3 network for multi-task training and saving the trained weight file; S4, inputting a picture to be recognized into the YOLOv3 network to obtain a multi-scale characteristic map; S5, using a logistic function to activate the x, y, confidence degree and category probability of the network prediction, and obtaining the coordinates, confidence degree and category probability of all prediction frames by judging the threshold value; S6, generating a final target detection frame and a recognition result by carrying out non-maximum suppression processing on the above result. The method of the invention solves the problem of low detection accuracy of prior methods, realizes multi-task training, does not need additional storage space, and achieves high detection accuracy and fast speed.
Owner:SOUTH CHINA UNIV OF TECH

Task history user interface using a clustering algorithm

The aspects of the disclosed embodiments include clustering a set of discrete user interface states into groups; presenting the groups on a display of a device; and enabling selection of any state within a presented group, wherein selection of a state returns the user interface to the selected state.
Owner:NOKIA TECHNOLOGIES OY

Big-data-based method and system for establishing and analyzing e-commerce user portrait of mobile terminal

Active | CN108021929A | Accurate understanding of behavioral preferences | Human Data Mining | Character and pattern recognition | Marketing | Cluster algorithm | Feature extraction
The invention discloses a big-data-based method and system for establishing and analyzing an e-commerce user portrait of a mobile terminal. The method comprises: offline data of a user are obtained; according to an identification code, data of different data sources are integrated into an offline knowledge base; preprocessing including normalization, discretization and attribute reduction is carried out on the offline data; feature extraction is carried out on the offline data based on a customized tag rule and a basic tag of the user is constructed; weight and time attenuation factor processing is carried out on the tag data and a user portrait offline prediction model based on a QPS cluster algorithm is established; data clustering mining is carried out on the offline knowledge base by using the prediction model to obtain an e-commerce user portrait of a mobile terminal; and distributed processing is carried out on online behavior data and then the processed data are integrated with the offline model. Therefore, massive data of the e-commerce transactions of the mobile terminal are analyzed in a big data environment; the real-time user behavior can be analyzed quickly and real-time image fusion is realized; and a multi-dimensional user portrait is built, so that the e-commerce user is analyzed comprehensively.
Owner:SOUTH CHINA UNIV OF TECH