50 results about "Pairwise similarity" patented technology

A pairwise similarity score provides a relevant measure of similarity between protein sequences. It incorporates biological knowledge about proteins and is particularly effective when combined with a support vector machine to predict protein-protein interactions (PPI).

Adjustment of document relationship graphs

Provided is a process of modifying semantic similarity graphs representative of pair-wise similarity between documents in a corpus, the method comprising obtaining a semantic similarity graph that comprises more than 500 nodes and more than 1000 weighted edges, each node representing a document of a corpus, and each edge weight indicating an amount of similarity between a pair of documents corresponding to the respective nodes connected by the respective edge; obtaining an n-gram indicating that edge weights affected by the n-gram are to be increased or decreased; expanding the n-gram to produce a set of expansion n-grams; adjusting edge weights of edges between pairs of documents in which members of the expanded n-gram set co-occur.
Owner:QUID LLC
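
The edge-adjustment step in this abstract can be illustrated with a minimal sketch. It assumes documents are already tokenized, that the expanded n-gram set is supplied by some upstream step, and that "adjusting" means multiplying the weight by a constant factor. The function names, the `factor` parameter, and the sample data are hypothetical and are not taken from the patent.

```python
def contains_ngram(tokens, ngram):
    """Return True if the token list contains the n-gram as a contiguous run."""
    n = len(ngram)
    return any(tuple(tokens[i:i + n]) == ngram for i in range(len(tokens) - n + 1))


def adjust_edge_weights(docs, edges, expanded_ngrams, factor):
    """Scale every edge whose two documents both contain a member of the set.

    docs: dict mapping doc id -> list of tokens
    edges: dict mapping (doc id, doc id) -> similarity weight
    expanded_ngrams: tuples of tokens (assumed output of the expansion step)
    factor: > 1 increases the affected weights, < 1 decreases them (assumption)
    """
    # Documents in which at least one expanded n-gram occurs.
    hits = {
        doc_id
        for doc_id, tokens in docs.items()
        if any(contains_ngram(tokens, g) for g in expanded_ngrams)
    }
    # Only edges whose endpoints are both in `hits` are rescaled.
    return {
        (a, b): w * factor if a in hits and b in hits else w
        for (a, b), w in edges.items()
    }


docs = {
    "d1": "solar panel efficiency gains".split(),
    "d2": "new solar cell design".split(),
    "d3": "wind turbine blades".split(),
}
edges = {("d1", "d2"): 0.4, ("d1", "d3"): 0.2, ("d2", "d3"): 0.1}
expanded = [("solar", "panel"), ("solar", "cell")]
adjusted = adjust_edge_weights(docs, edges, expanded, factor=1.5)
```

Here only the edge between `d1` and `d2` is rescaled, because those are the only two documents in which members of the expanded set co-occur.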

Model selection for cluster data analysis

A model selection method is provided for choosing the number of clusters, or more generally the parameters of a clustering algorithm. The algorithm is based on comparing the similarity between pairs of clustering runs on sub-samples or other perturbations of the data. High pairwise similarities show that the clustering represents a stable pattern in the data. The method is applicable to any clustering algorithm, and can also detect lack of structure. We show results on artificial and real data using a hierarchical clustering algorithm.
Owner:HEALTH DISCOVERY CORP +1
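
As a rough illustration of comparing pairs of clustering runs, the sketch below scores two runs by the Jaccard coefficient over co-clustered point pairs, restricted to the points both sub-samples share. The choice of Jaccard and all function names are assumptions made for this example; the abstract does not specify the similarity measure.

```python
from itertools import combinations


def co_clustered_pairs(labels):
    """Set of point pairs that share a cluster. labels: dict point id -> label."""
    return {
        (p, q)
        for p, q in combinations(sorted(labels), 2)
        if labels[p] == labels[q]
    }


def clustering_similarity(labels_a, labels_b):
    """Jaccard coefficient over co-clustered pairs, on points present in both runs."""
    shared = set(labels_a) & set(labels_b)
    a = co_clustered_pairs({p: labels_a[p] for p in shared})
    b = co_clustered_pairs({p: labels_b[p] for p in shared})
    if not a and not b:
        return 1.0  # both runs put every shared point in its own cluster
    return len(a & b) / len(a | b)


def stability(runs):
    """Mean pairwise similarity over all pairs of clustering runs."""
    sims = [clustering_similarity(x, y) for x, y in combinations(runs, 2)]
    return sum(sims) / len(sims)


# Three runs on overlapping sub-samples of points 0..4 (toy data).
run1 = {0: "a", 1: "a", 2: "b", 3: "b"}
run2 = {1: "x", 2: "y", 3: "y", 4: "x"}
run3 = {0: "p", 1: "q", 2: "q", 3: "p"}
score = stability([run1, run2, run3])
```

Cluster labels never need to match across runs, since only co-membership of point pairs is compared. In a model selection loop one would compute `stability` for each candidate number of clusters and prefer values where the score stays high.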

Method and System for Clustering Data Points

Systems and methods for clustering a group of data points based on a measure of similarity between each pair of data points in the group are provided. A pairwise similarity function can be estimated for each pair of data points in the group. A clustering algorithm can be executed to create clusters and associate data points with the clusters using the pairwise similarity function. The algorithm can be iterated multiple times until a stopping condition is reached in order to reduce variance in the output of the algorithm. The pairwise similarity function for each pair of data points can be updated between iterations of the algorithm and the results of each iteration can be aggregated. The data in each data point associated with a cluster can be consolidated into a consolidated data point.
Owner:GOOGLE LLC

Chinese Web document online clustering method based on common substrings

The invention discloses an online clustering method for Chinese Web documents based on common substrings. As the volume of information on the internet grows sharply, search engines have become important for finding and locating information. Web document clustering automatically groups the results returned by a search engine by theme, helping users narrow the query range and quickly locate the information they need. Online clustering of Web documents must satisfy two requirements: it must handle the non-numerical, unstructured nature of Web documents, and its clustering time must meet the demands of online search. To address both requirements, the method comprises the following steps: (1) preprocessing the first n query results returned by the search engine, deleting or replacing non-Chinese characters; (2) extracting common substrings from the Web documents using GSA; (3) defining a weighting formula modeled on TF*IDF over the extracted common substrings and building a document feature vector model; (4) computing the pairwise similarity of the Web documents from the model to obtain a similarity matrix; (5) clustering the Web documents from the matrix with an improved hierarchical clustering algorithm; and (6) generating cluster descriptions and extracting labels. The method shows clear advantages in performance, cluster label generation, and clustering time.
Owner:BEIHANG UNIV
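
Steps (3) and (4) of this abstract can be sketched as follows. The sketch uses plain TF*IDF weights and cosine similarity as stand-ins: the patent's own weighting formula is only described as referring to TF*IDF, and its similarity measure is not named, so both choices are assumptions. Each document is represented as a list of already-extracted features (for example, common substrings), and the function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF*IDF vectors. docs: list of feature lists, one per document."""
    n = len(docs)
    # Document frequency: in how many documents each feature occurs.
    df = Counter(f for doc in docs for f in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({f: c * math.log(n / df[f]) for f, c in tf.items()})
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def similarity_matrix(docs):
    """Pairwise similarity of every document against every other document."""
    vecs = tfidf_vectors(docs)
    return [[cosine(u, v) for v in vecs] for u in vecs]


# Toy feature lists standing in for extracted common substrings.
docs = [["a", "b"], ["a", "b"], ["c"]]
sim = similarity_matrix(docs)
```

The resulting matrix is symmetric and could be passed to a hierarchical clustering routine, which corresponds to step (5).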

Method of identifying outliers in item categories

A system and method of identifying outliers in item categories are described. A pairwise similarity measurement may be determined between each item listing in a plurality of item listings based on a comparison of at least one feature of each item listing. At least one outlier among the plurality of item listings may be determined using the pairwise similarity measurements. The feature(s) may comprise at least one feature from a group of features consisting of: a title, an image, a price, an attribute, and a description. Each item listing in the plurality of item listings may belong to the same leaf or non-leaf category in a network-based marketplace or publication system. The outlier(s) may be determined using at least one clustering algorithm. The clustering algorithm(s) may comprise an agglomerative hierarchical clustering algorithm and / or a density-based clustering algorithm.
Owner:EBAY INC
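
A simplified sketch of the idea follows. It compares listings on a single feature (title tokens) using Jaccard similarity, and it flags a listing when its mean similarity to the other listings falls below a threshold. This thresholding rule is swapped in for brevity; the abstract itself describes using clustering algorithms to determine outliers. The function names, the threshold value, and the sample titles are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0


def find_outliers(titles, threshold):
    """Return indices of listings whose mean similarity to the rest is below threshold."""
    tokens = [t.lower().split() for t in titles]
    outliers = []
    for i, ti in enumerate(tokens):
        others = [jaccard(ti, tj) for j, tj in enumerate(tokens) if j != i]
        if others and sum(others) / len(others) < threshold:
            outliers.append(i)
    return outliers


titles = [
    "red cotton t-shirt large",
    "blue cotton t-shirt large",
    "green cotton t-shirt small",
    "stainless steel kitchen knife",
]
print(find_outliers(titles, threshold=0.2))  # prints [3]
```

The kitchen knife shares no title tokens with the three shirts, so its mean similarity is 0 and it is the only listing flagged.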

Level set SAR (Synthetic Aperture Radar) image segmentation method based on self-adaptive finite element

The invention discloses a level set SAR (Synthetic Aperture Radar) image segmentation method based on a self-adaptive finite element, which mainly addresses the imprecision of conventional variational level set models based on statistical distributions when segmenting non-homogeneous SAR images. The method comprises the following steps: (1) optimizing an image partitioning energy term on the basis of the minimum cutset criterion of image partitioning; (2) defining a weighted energy functional by combining this term with a level set regularization term and a length constraint term; (3) minimizing the energy functional by variational methods to obtain a curve evolution control equation; (4) discretizing on a finite element mesh to obtain a semi-implicit discrete scheme of the curve evolution control equation; and (5) applying a self-adaptive finite element mesh adjustment strategy based on a posteriori error estimation, carrying out the level set evolution on a triangular mesh, and obtaining the segmentation result of the SAR image. Because the energy functional is defined using pairwise similarity, the limitation of conventional statistical models is overcome; meanwhile, the numerical computation strategy based on the self-adaptive finite element achieves an effective balance between segmentation quality and computing efficiency.
Owner:ZHEJIANG GONGSHANG UNIVERSITY

Techniques for mixed-initiative visualization of data

In various embodiments, a visualization engine generates graphs that facilitate sense making operations on data sets. A graph includes nodes that are associated with a data set and edges that represent relationships between the nodes. In operation, the visualization engine computes pairwise similarities between the nodes. Subsequently, the visualization engine computes a layout for the graph based on the pairwise similarities and user-specified constraints. Finally, the visualization engine renders a graph for display based on the layout, the nodes, and the edges. Advantageously, by interactively specifying constraints and then inspecting the topology of the automatically generated graph, the user may efficiently explore salient aspects of the data set.
Owner:AUTODESK INC

Attribute graph literature clustering method based on graph convolutional neural network

The invention discloses an attribute graph literature clustering method based on a graph convolutional neural network, and belongs to the field of graph data mining. Specifically, the method learns features of the literature attribute graph using a cross-layer linked graph convolutional neural network; estimates an optimal number of clusters from the node features using a deep clustering estimation model; alternates these two steps to complete training; uses the trained model to obtain the features of all literature attribute graph nodes to be clustered and the estimated number of clusters; and takes these features and the estimated number of clusters as input to a k-means clustering method to obtain the clustering result for the literature attribute graph. When the cross-layer linked graph convolutional neural network is trained, a self-separation regularization term based on pairwise node similarity is adopted, so that the features of nodes in the same cluster are similar and the features of nodes in different clusters are far apart, which effectively improves graph clustering performance. The clustering estimation module provides data-driven estimation of the number of clusters, making the whole system better suited to real, unlabeled data.
Owner:BEIJING UNIV OF TECH

Image text cross-modal retrieval method based on category information alignment

Active · CN113010700A · Eliminate heterogeneity differences · Invariance guarantees that the learned representations have both · Metadata multimedia retrieval · Image coding · Machine learning · Pairwise similarity
The invention discloses an image-text cross-modal retrieval method based on category information alignment, which aims to preserve the distinction between instances (image texts) of different semantic categories while eliminating heterogeneity differences. To this end, category information is introduced into a common representation space, namely the image-text common space, to minimize the discrimination loss, and a cross-modal loss is introduced to align information from the different modalities. In addition, a category information embedding method is used to generate false features, instead of other DNN-based methods for label information; at the same time, the modal invariance loss is minimized in a category common space to learn modality-invariant features. Guided by this learning strategy, the pairwise similarity semantic information of coupled image-text items is exploited as fully as possible, ensuring that the learned representations have both semantic-structure discrimination and cross-modal invariance.
Owner:UNIV OF ELECTRONIC SCI & TECH OF CHINA