How to perform a similarity query?

80 results about "Similarity query" patented technology

Graph querying, graph motif mining and the discovery of clusters

Active | US20110173189A1 | Digital data information retrieval; Digital data processing details; Similarity query; Histogram

A method for analyzing, querying, and mining graph databases using subgraph and similarity querying. An index structure, known as a closure tree, is defined for topological summarization of a set of graphs. In addition, a significance model is created in which the graphs are transformed into histograms of primitive components. Finally, connected substructures or clusters, comprising paths or trees, are detected in networks found in the graph databases using a random walk technique and a repeated random walk technique.

Owner:RGT UNIV OF CALIFORNIA

Graph querying, graph motif mining and the discovery of clusters

Active | US8396884B2 | Digital data information retrieval; Digital data processing details; Graphics; Similarity query

Owner:RGT UNIV OF CALIFORNIA

Server-side personal privacy data protecting method in network information system

Active | CN103973668A | Solve the problem of personal privacy protection; Flexible query performance; Transmission; Special data processing applications; Similarity query; Internet privacy

The invention discloses a server-side personal privacy data protection method for a network information system. The method supports various common types of text query with high query performance and high security. Middleware placed between the client side and the server side of the network information system implements the method and provides two functions. First, personal privacy data entered at the client side is encrypted before it is stored in the server-side back-end database, which keeps personal privacy information safe on an untrusted server. Second, suitable indexes are built over the personal privacy data to support common text queries, including exact query, similarity query, and range query, while keeping ciphertext queries efficient.

Owner:上海静客网络科技有限公司

Systems and methods for privacy-assured similarity joins over encrypted datasets

Active | US20180157703A1 | Strong data protection; Reduce query latency; Memory architecture accessing/allocation; Multiple keys/algorithms usage; Data set; Similarity query

Systems and methods which provide secure queries with respect to encrypted datasets are described. Embodiments provide privacy-assured similarity join techniques operable with large-scale encrypted datasets. A privacy-assured similarity join technique of embodiments enables a storage system to answer similarity join queries without learning the content of the query dataset and the target dataset. One or more secure query schemes may be implemented in accordance with a privacy-assured similarity join technique herein. For example, embodiments may utilize an individual similarity query scheme, a frequency hiding query scheme, and / or a result sharing query scheme. A particular secure query scheme of the foregoing secure query schemes may be utilized to address different considerations with respect to security, efficiency, and deployability with respect to various applications and scenarios with different requirements.

Owner:CITY UNIVERSITY OF HONG KONG

Indexing and retrieval system based on HBase-ORM (Object Relational Mapping)

Active | CN106202207A | Simplify the modification process; The removal process is simple; Other databases indexing; Special data processing applications; Similarity query; Fuzzy query

The invention provides an indexing and retrieval system based on HBase-ORM (Object Relational Mapping). Data insertion, reading, and modification are performed by automatically establishing a mapping between the underlying database tables and upper-layer database objects. The database layer is separated from the data access layer, so upper-layer developers can concentrate on business logic, which improves development efficiency and reduces the error rate. An index for each row of data in HBase is built by type using Elasticsearch, enabling fuzzy queries on text, interval queries on numeric values, range queries on longitude and latitude, and similarity queries on images, so that the real-time query demands of Web users on different data types are met.

Owner:THE 28TH RES INST OF CHINA ELECTRONICS TECH GROUP CORP

Wind turbine generator system state prediction method for carrying out similarity search on basis of history data

Inactive | CN106779200A | Implement security assessment; Forecasting; Information technology support system; Cluster algorithm; Electricity

The invention provides a wind turbine generator system state prediction method that performs similarity search on the basis of history data, and relates to the technical field of wind turbine generator system state monitoring. The method comprises the following steps: selecting fan attributes and reducing dimensionality after preprocessing the history data; clustering the dimensionality-reduced data with an improved K-means clustering algorithm; and running a similarity query over the history data to predict the fan operation state. With this method, a database can be established from the history operation data of the wind turbine generator system, and current operation data can be compared in real time with the history data of the same fan and of similar fans to assess the state of the wind turbine generator system.

Owner:NORTHEASTERN UNIV

Time-series similarity measurement method based on segmented statistical approximate representation

Inactive | CN104462217A | Fully capture local fluctuation patterns; Improve matching accuracy; Special data processing applications; Similarity query; Pattern matching

The invention discloses a time-series similarity measurement method based on segmented statistical approximate representation. The method comprises two steps: feature extraction and dynamic pattern matching. First, a time series is segmented into sub-series, the statistical features of each sub-series are extracted in order, and local pattern feature vectors are constructed. Then the distance between local pattern feature vectors is calculated with a weighted Euclidean distance to achieve local pattern matching, and this local matching is used as a subroutine of a dynamic programming algorithm to achieve global pattern matching. According to the abstract, the method substantially outperforms other measurement methods in measurement precision and computational efficiency. It applies to similarity-based search, classification, clustering, prediction, anomaly detection, on-line pattern recognition, and the like over large-scale sampled data or high-speed dynamic data streams, in areas such as financial transactions, traffic control, air quality and temperature monitoring, industrial process monitoring, and medical diagnosis.

Owner:ZHEJIANG UNIV
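
The two steps in the abstract above (per-segment statistical features, then dynamic programming over a weighted Euclidean distance) can be sketched roughly as follows. This is an illustrative sketch, not the patented method: the fixed segment length, the choice of features (mean, standard deviation, minimum, maximum), the weights, and the function names `segment_features`, `weighted_euclidean`, and `segment_dtw` are all assumptions.

```python
import math
from statistics import mean, pstdev

def segment_features(series, seg_len):
    """Split a series into fixed-length segments (assumed segmentation rule)
    and describe each segment by (mean, std, min, max)."""
    feats = []
    for start in range(0, len(series), seg_len):
        seg = series[start:start + seg_len]
        feats.append((mean(seg), pstdev(seg), min(seg), max(seg)))
    return feats

def weighted_euclidean(a, b, weights):
    """Weighted Euclidean distance between two local feature vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def segment_dtw(series_a, series_b, seg_len=4, weights=(1.0, 1.0, 0.5, 0.5)):
    """Global matching: dynamic time warping over segment feature vectors,
    with the weighted Euclidean distance as the local cost."""
    fa = segment_features(series_a, seg_len)
    fb = segment_features(series_b, seg_len)
    inf = float("inf")
    cost = [[inf] * (len(fb) + 1) for _ in range(len(fa) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(fa) + 1):
        for j in range(1, len(fb) + 1):
            local = weighted_euclidean(fa[i - 1], fb[j - 1], weights)
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(fa)][len(fb)]
```

Because the dynamic program runs over segments rather than raw points, its table has roughly (n / seg_len) squared cells instead of n squared, which is one plausible source of the efficiency the abstract claims.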

Method and equipment for identifying sensitive image

Active | CN102306287A | Reduce pressure on sensitive image recognition; Reduce recognition pressure; Character and pattern recognition; Pattern recognition; Application server

The invention aims to provide a method and equipment for identifying a sensitive image. The method for identifying the sensitive image comprises the following steps of: acquiring a reference sensitive image; establishing or updating a reference sensitive image library according to the reference sensitive image; acquiring an image to be processed, and identifying whether the image to be processed is a sensitive image or not; and performing similarity query in the reference sensitive image library according to the image to be processed to acquire a query result corresponding to the image to be processed. Compared with the prior art, the method and the equipment have the advantages that: repeated sensitive images in a network are effectively identified, the sensitive image identification pressure of an application server is reduced, and a network environment is purified simultaneously, so that netizens can obtain a better network surfing experience. The method further comprises the step of: correspondingly shielding the image to be processed and / or the source thereof according to the query result based on the reference sensitive image library.

Owner:BAIDU ONLINE NETWORK TECH (BEIJING) CO LTD

Time series signifying method based on local feature cluster

Inactive | CN103136327A | Preserve morphological properties; Convenient research; Special data processing applications; Similarity query; Algorithm

The invention provides a time series signifying (symbolization) method based on local feature clustering. The method reads the original time series; calls a sliding window procedure that divides the original time series into multiple sub time series; represents each sub time series by multiple slopes; clusters the sub time series with a K-means clustering algorithm; and assigns a corresponding symbol to each clustering result. The method reduces dimensionality while keeping the shape features of the time series, which benefits further study of the time series. It also addresses the problems that arise in the prior art when similarity query, classification, clustering, pattern mining, and the like are run directly on the original time series: low storage and computational efficiency, and reduced accuracy and reliability of the algorithms.

Owner:CHINA UNIV OF MINING & TECH
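
A minimal sketch of the pipeline in the abstract above (sliding window, slope features, K-means, one symbol per cluster) might look like this. It simplifies in ways the patent does not state: each window is summarized by a single least-squares slope rather than multiple slopes, the K-means is one-dimensional with a deterministic initialization, and the names `window_slopes`, `kmeans_1d`, and `symbolize` are hypothetical.

```python
def window_slopes(series, width, step):
    """Least-squares slope of each sliding window (width must be >= 2)."""
    xs = list(range(width))
    x_mean = sum(xs) / width
    denom = sum((x - x_mean) ** 2 for x in xs)
    slopes = []
    for start in range(0, len(series) - width + 1, step):
        win = series[start:start + width]
        y_mean = sum(win) / width
        num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, win))
        slopes.append(num / denom)
    return slopes

def kmeans_1d(values, k, iters=20):
    """Plain 1-D K-means; initial centers are spread over the sorted values."""
    ordered = sorted(values)
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center when a cluster ends up empty.
        centers = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers

def symbolize(series, width=3, step=3, k=3):
    """Map each window to a letter: 'a' is the cluster with the lowest slope."""
    slopes = window_slopes(series, width, step)
    centers = sorted(kmeans_1d(slopes, k))
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    return "".join(
        alphabet[min(range(k), key=lambda c: abs(s - centers[c]))] for s in slopes
    )
```

The resulting string is much shorter than the original series, so later similarity queries can compare symbol sequences instead of raw values.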

Multi-dimensional index structure under cloud environment, construction method thereof and similarity query method

Inactive | CN102831225A | Save storage space; Reduce resource consumption; Transmission; Special data processing applications; Similarity query; Resource consumption

The invention discloses a multi-dimensional index structure under a cloud environment, a construction method thereof, and a similarity query method. The index structure comprises a global index and local indexes located at each storage node; the cloud environment uses an overlay network to organize the storage nodes. Each local index is the clustering result obtained by clustering the approximate vectors of all vector data in the storage node where that local index is located. The global index holds the clustering-center information of all the local indexes, distributed across the whole overlay network, together with the addresses of the storage nodes where the clustering centers are located. The index structure reduces index storage space and resource consumption and effectively supports multi-dimensional data indexing and similarity query under the cloud environment. Because the local indexes are built from the clustering information of the approximate vectors, a query over a local index uses the clustering-center information to search only the corresponding clusters instead of scanning all the approximate vectors, which improves query efficiency.

Owner:NANJING UNIV OF POSTS & TELECOMM
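
The global/local split described above can be illustrated with a small in-memory sketch. Several things here are assumptions rather than details from the patent: a single-pass "leader" clustering stands in for whatever clustering the patent uses, raw vectors are clustered instead of approximate vectors, the overlay network is replaced by a plain dictionary of node ids, and the names `build_local_index`, `build_global_index`, and `nearest` are hypothetical.

```python
import math

def build_local_index(vectors, radius):
    """Leader clustering (assumed stand-in): a vector joins the first cluster
    whose center lies within `radius`, otherwise it starts a new cluster."""
    clusters = []  # list of (center, members)
    for v in vectors:
        for center, members in clusters:
            if math.dist(center, v) <= radius:
                members.append(v)
                break
        else:
            clusters.append((v, [v]))
    return clusters

def build_global_index(nodes, radius):
    """Local index per storage node, plus a global list of
    (cluster center, node id, cluster id) entries."""
    local = {node_id: build_local_index(vecs, radius) for node_id, vecs in nodes.items()}
    global_index = [
        (center, node_id, cluster_id)
        for node_id, clusters in local.items()
        for cluster_id, (center, _) in enumerate(clusters)
    ]
    return global_index, local

def nearest(query, global_index, local, probes=1):
    """Approximate nearest neighbor: scan only the `probes` clusters whose
    centers are closest to the query instead of every stored vector."""
    ranked = sorted(global_index, key=lambda entry: math.dist(query, entry[0]))[:probes]
    candidates = [v for _, node_id, cid in ranked for v in local[node_id][cid][1]]
    return min(candidates, key=lambda v: math.dist(query, v))
```

With `probes=1` the answer can miss a closer vector that sits in a neighboring cluster; raising `probes` trades query cost for accuracy.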

Anomaly detection method based on data incremental graphs

Inactive | CN103546916A | Reduce false alarm rate; Network topologies; Data set; Event type

The invention discloses an anomaly detection method based on data incremental graphs. The anomaly detection method includes the following steps that detection data in a current monitoring zone of a wireless sensor network are collected and preprocessed, and an event zone is determined; data sets relevant to a current event are acquired, a graph model is utilized to abstractly generalize event data, and the event data are converted into the event data incremental graphs; a graph similarity algorithm based on structure correlation is utilized to search an event mode graph database for event mode graphs similar to the event graphs and judge the type of the current event, wherein the event mode graph database is a set of the event mode graphs; the event mode graphs are the event data incremental graphs and abstract description for types of events; by the adoption of the graph similarity query algorithm based on the structure correlation, the graph similarity query problem is converted into the sequence similarity query problem, and therefore query complexity is effectively reduced. By the adoption of the anomaly detection method based on the data incremental graphs, the event graphs can be acquired based on domain expert knowledge or data analysis and used for detecting complex events, the detection efficiency of the events is improved, and the false alarm rate is reduced.

Owner:SOUTHEAST UNIV

Time sequence similarity measurement method based on self-adaptive piecewise statistical approximation

Inactive | CN104820673A | Efficient identification; Complete Volatility Trend; Special data processing applications; Similarity query; Pattern matching

The invention discloses a time sequence similarity measurement method based on self-adaptive piecewise statistical approximation. First, the time sequence is segmented into subsequences containing complete fluctuation trends, based on turning points identified by time sequence coding. Second, various statistical features of each subsequence are extracted in order to construct local pattern feature vectors. Last, the distance between local pattern feature vectors is computed with a normalized distance to achieve local pattern matching, which is then used as a subroutine of a dynamic programming algorithm to achieve global pattern matching. According to the abstract, the method substantially outperforms other measurement methods in measurement precision and computational efficiency. It applies to similarity search, classification, clustering, prediction, anomaly detection, on-line pattern recognition, and other processing of large-scale sampled data or high-speed dynamic data streams in banking transactions, traffic control, air quality and temperature monitoring, industrial process monitoring, medical diagnosis, and other applications.

Owner:ZHEJIANG UNIV

High dimension data index method based on maximum clearance space mappings

Inactive | CN101266607A | Improve query performance; Improve efficiency; Special data processing applications; Visit time; Similarity query

The invention relates to a high-dimensional data index method based on maximum clearance space mapping and belongs to the database field. It comprises the following steps. Step 1, maximum clearance space mapping: calculate the clearance value of each dimension of the given data space, select the K dimensions with the largest clearance values, and project the actual data points of the given data space into the K-dimensional space. Step 2, building the MS-tree: first find a suitable node M for insertion; if the node is not full, insert the object directly; if the node is full, split it; then check whether the inserted object lies within the MBR of node M and, if not, update the MBR of node M and map the original space into the low-dimensional space. Step 3, processing the similarity query. The invention improves the performance of index similarity queries by reducing the number of visits to false active subtrees.

Owner:NORTHEASTERN UNIV

Method for calculating similarity of mobile applications based on content

Active | CN105677695A | Improve accuracy; Increase success rate; Special data processing applications; Database design/maintenance; Data warehouse; Similarity query

The invention relates to a content-based method for calculating the similarity of mobile applications. The method comprises the following steps: acquiring a large amount of mobile application information, including application names, types, descriptions, and sizes, and extracting it; performing word segmentation on the application descriptions; dividing the segmented content into two parts, one used as the training corpus for a word2vec model and the other stored as file sets, processed with TF-IDF calculations, and then stored in an HBase data warehouse; and querying and calculating the similarity of applications. The method responds rapidly to app similarity queries; because app features are built from content and description information, apps can be characterized well and high accuracy is obtained, which improves the accuracy of app query and recommendation.

Owner:HANGZHOU YUANCHENG TECH CO LTD
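
The TF-IDF part of the abstract above can be sketched in a few lines. This covers only a small slice of the described system and rests on assumptions: whitespace splitting stands in for word segmentation, the smoothed IDF formula is a common convention rather than the patent's, the word2vec model and HBase storage are left out, and the names `tfidf_vectors`, `cosine`, and `most_similar` are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(descriptions):
    """Turn each app description into a sparse {term: weight} TF-IDF vector."""
    tokenized = [text.lower().split() for text in descriptions]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = max(len(tokens), 1)  # guard against empty descriptions
        vectors.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(index, descriptions):
    """Index of the description most similar to descriptions[index]."""
    vectors = tfidf_vectors(descriptions)
    scores = [(cosine(vectors[index], v), i) for i, v in enumerate(vectors) if i != index]
    return max(scores)[1]
```

In a deployment like the one described, the vectors would be computed once and stored, so a similarity query only needs the cosine step.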

Long sequence data dimensionality reduction method used for approximate query

Inactive | CN101196921A | Complete efficiently; Efficient Approximation Computational Problems; Special data processing applications; Social benefits; Similarity query

A long sequence data dimensionality reduction method oriented to similarity query is provided. Sequence embedding technology converts the sequence data into an embedded tree and extracts a plurality of sets; a plurality of corresponding principal components is extracted based on the embedded tree and the sets, and on this basis a sequence data dimensionality reduction principle based on distance convergence is put forward. Based on the properties of the dimensionality reduction, an index structure for sequence similarity query, the SEM-tree, is constructed, and an efficient similarity query method for long sequence data is put forward on the basis of that index structure and the principle of double bounds on sequence distance (maximum upper bound and minimum bound). The invention can be widely applied to similarity query over long sequence data, such as finding search targets by similarity search in massive Internet text data and running similarity query and analysis on gene fragments in large-scale genetic data. The invention is expected to yield notable economic and social benefits.

Owner:PEKING UNIV

Time sequence index based on trends

Inactive | CN104216924A | Fast and efficient build; Accurate query; Special data processing applications; Data dredging; Similarity query

The invention provides an index mode based on change trends of time sequences (determined time sequences and undetermined time sequences) and first-order connectivity indexes, and the index mode has great significance to time sequence prediction, classification, data mining, knowledge discovery and the like. The index mode solves the problems of high data redundancy or low match accuracy and low index efficiency caused by the time sequence space indexes, precise query, similarity query, clustering and classification of the time sequences can be finished effectively through the index, and the time complexity and space complexity of sequence query, clustering and classification are lowered greatly. According to the index mode, firstly interval segmentation is conducted on the time sequences and time dimensions, short trend symbol sequences are generated in a mapping mode according to the change trends of the time sequences in all sections, then the first-order connectivity indexes of the section rising trend, section descending trend, section wave-crest trend, section wave-trough trend and section gentle trend are calculated for the symbol sequences, and finally a B-Tree index of a time sequence database is built by the adoption of the one-order indexes of the five trends.

Owner:肖瑞

Similarity query system and method suitable for moving target branch track

Active | CN110162586A | Implementing a distance metric; Solve the distance problem; Geographical information databases; Special data processing applications; Similarity query; Algorithm

The invention relates to a similarity query system and method suitable for moving target branch tracks. It aims to overcome the difficulty, in the prior art, of calculating the similarity of branch tracks in terms of topological structure, moving track, and evolution duration. The method converts the track data into two data structures, a sequence graph and a path set; measures the similarity of the target track and the reference track in evolution structure, moving path, and evolution duration through three defined distances, namely the sequence graph DSG edit distance, the path set PED distance, and the path set DT distance; and integrates the three distance metrics into an overall similarity between the target trajectory and the reference trajectory. Based on this similarity, the branch trajectory most similar to a given moving target branch trajectory is queried and matched quickly and accurately, which supports space-time trajectory mining and visualization and similar-case reasoning.

Owner:INST OF GEOGRAPHICAL SCI & NATURAL RESOURCE RES CAS

Systems and methods for photon map querying

Active | US20100332523A1 | Digital data processing details; Multi-dimensional databases; Code module; Similarity query

In one aspect, photon queries are answered using systems and methods of traversal of collections of photon queries through an acceleration structure, to identify photons meeting a specification of a given query. Such systems and methods can be extended to satisfying similarity queries in an n-dimensional parameter space. Queries can be associated with code (or pointers to code) that are run to achieve closure of that query. Queries can cause further queries to be emitted. Arbitrary data can be passed from one query to another; for example, parameters defined internally to the code modules themselves (e.g., the parameters do not need to have a definition or meaning to the systems or within the methods).

Owner:IMAGINATION TECH LTD

Mould CAD drawing query based on similarity query and management method

Inactive | CN104408161A | Simple method; Improve query accuracy; Special data processing applications; Similarity query; Degree of similarity

The invention discloses a mould CAD drawing query and management method based on similarity query, with the following steps: importing drawings in batches, extracting the basic information of each CAD drawing, and writing that information into the database in batches; extracting the geometric moments of the mould section as the shape feature; extracting basic element features such as the points, lines, and surfaces in the drawing; and taking the extracted features as similarity calculation parameters, defining a similarity percentage as required, comparing the shape features of different drawings, and calculating the similarity between drawings to obtain the final search result. The method is simple and offers high query precision.

Owner:周理

Large-scale trajectory data similarity query method based on multistage index structure

Pending | CN113051359A | Improve query efficiency; The number of track points is small; Relational databases; Character and pattern recognition; Similarity query; Data set

The invention discloses a large-scale trajectory data similarity query method based on a multi-level index structure, and belongs to the field of urban traffic big data processing and application. The method is divided into an index establishment stage and a trajectory similarity query stage. In the index establishment stage, data preprocessing is firstly carried out on original trajectory data, a grid index is established for the trajectory data obtained after preprocessing based on a spatial grid index idea, and grid division is carried out on a trajectory data set through the grid index. Secondly, the feature information of each trajectory is represented by constructing a feature trajectory, a start-stop index is established for the start point and the end point of each trajectory in the space grid, and a feature point index is established according to the feature trajectory point of each trajectory; therefore, a feature trajectory formed by the trajectory points with the trajectory feature information is applied to the multi-level index structure. Finally, a multi-level index structure consisting of the grid index, the start-stop index and the feature point index is established.

Owner:DALIAN UNIV OF TECH
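
Two of the three index levels named in the abstract above, the grid index and the start-stop index, can be sketched as plain dictionaries. This is only an illustration under assumptions: trajectories are lists of 2-D points, the feature point index is omitted, the overlap-ratio filter is an invented stand-in for the patent's similarity query stage, and the names `cell_of`, `build_indexes`, and `candidate_ids` are hypothetical.

```python
from collections import defaultdict

def cell_of(point, cell_size):
    """Grid cell (column, row) that contains a 2-D point."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))

def build_indexes(trajectories, cell_size):
    """Grid index: cell -> ids of trajectories passing through it.
    Start-stop index: (start cell, end cell) -> trajectory ids."""
    grid_index = defaultdict(set)
    start_stop_index = defaultdict(set)
    for tid, points in trajectories.items():
        for point in points:
            grid_index[cell_of(point, cell_size)].add(tid)
        ends = (cell_of(points[0], cell_size), cell_of(points[-1], cell_size))
        start_stop_index[ends].add(tid)
    return grid_index, start_stop_index

def candidate_ids(query, grid_index, start_stop_index, cell_size, min_overlap=0.5):
    """Candidates share the query's start and end cells and cover at least
    `min_overlap` of the cells the query passes through (assumed filter)."""
    ends = (cell_of(query[0], cell_size), cell_of(query[-1], cell_size))
    same_ends = start_stop_index.get(ends, set())
    query_cells = {cell_of(p, cell_size) for p in query}
    hits = defaultdict(int)
    for cell in query_cells:
        for tid in grid_index.get(cell, set()) & same_ends:
            hits[tid] += 1
    return sorted(t for t, h in hits.items() if h / len(query_cells) >= min_overlap)
```

Only the surviving candidates would then need an exact, and more expensive, trajectory distance computation.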

Performance data dependency analyzing method and performance monitoring system

Active | CN104933175A | Simple structure; Easy to operate; Web data indexing; Special data processing applications; Graphics; Feature vector

The invention applies to the field of performance data analysis and provides a performance data dependency analyzing method and a performance monitoring system. The method comprises: collecting performance data for a plurality of metrics and storing it to build a database; extracting the performance data of one metric from the database and building a time series in order of collection time; extracting a graphic feature vector of the time series; building a graphic feature index for the graphic feature vectors; building a target time series; extracting a target graphic feature vector of the target time series; and querying the graphic feature index for graphic feature vectors similar to the target graphic feature vector, then ranking and outputting the query results. Using time series similarity search technology, the method enables big data analytics on performance data, so that how values change along the time axis is reflected intuitively and a manager can use the query results to directly analyze the different factors that cause performance problems.

Owner:珠海金智维信息科技有限公司

A time series similarity searching method based on segmentation weight

Active | CN109359135A | Improve accuracy; Digital data information retrieval; Special data processing applications; Similarity query; Algorithm

The invention discloses a time series similarity searching method based on segment weight, which comprises the following steps: (1) segmenting a query sequence q at the important turning points of the time series; (2) establishing a weighted piecewise Euclidean distance as the similarity measure of time series; (3) performing a k-nearest neighbor similarity query on q to find the k sequences most similar to q; (4) ending the query if the user is satisfied with the result obtained in step (3), or otherwise marking the result and entering step (5); (5) updating the segment weights using the sequences tagged by the user and returning to step (2). The invention automatically updates the weights through user feedback and adaptively learns how much attention the user pays to different segments, which can improve the accuracy of the similarity measure and further optimize the search result.

Owner:HOHAI UNIV
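
Steps (1) to (3) of the abstract above can be sketched as follows; the feedback-driven weight update in steps (4) and (5) is left out because the abstract does not give its rule. The sketch assumes that all sequences have the same length as the query, that "important turning points" can be approximated by local extrema, and that the names `turning_points`, `segments_from`, `weighted_segment_distance`, and `knn` are hypothetical.

```python
import math

def turning_points(series):
    """Indices of local extrema plus both endpoints (assumed criterion)."""
    idx = [0]
    for i in range(1, len(series) - 1):
        left = series[i] - series[i - 1]
        right = series[i + 1] - series[i]
        if left * right < 0:
            idx.append(i)
    idx.append(len(series) - 1)
    return idx

def segments_from(bounds, length):
    """Half-open (start, end) segments that together cover 0..length."""
    return list(zip(bounds[:-1], bounds[1:-1] + [length]))

def weighted_segment_distance(q, s, segments, weights):
    """Piecewise Euclidean distance with one weight per segment."""
    total = 0.0
    for (lo, hi), w in zip(segments, weights):
        total += w * sum((a - b) ** 2 for a, b in zip(q[lo:hi], s[lo:hi]))
    return math.sqrt(total)

def knn(query, database, k, weights=None):
    """Names of the k sequences closest to the query under the weighted distance."""
    segments = segments_from(turning_points(query), len(query))
    weights = weights or [1.0] * len(segments)
    scored = sorted(
        (weighted_segment_distance(query, s, segments, weights), name)
        for name, s in database.items()
    )
    return [name for _, name in scored[:k]]
```

Raising the weight of a segment makes mismatches there count for more, which is the lever a feedback loop like the one in the abstract would adjust.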

Privacy-protected encrypted image retrieval method and system

Active | CN112528064A | Improve retrieval efficiency; Hide similarity ranking information; Still image data indexing; Digital data protection; Feature vector; Similarity query

The invention discloses a privacy-protected encrypted image retrieval method and system. The method is implemented as follows: an image owner extracts the feature vector of an image, encrypts the image and the feature vector, and uploads both to a cloud server; when a query user searches, the query vector of the query image is extracted and encrypted, and a trapdoor is generated from the encrypted query vector and a set similarity query threshold and sent to the cloud server; the cloud server searches the encrypted image set according to the query trapdoor and the index table and returns the retrieval result to the query user, who decrypts it to obtain the matching images. The method and system build on a chaotic-mapping image encryption algorithm and use a convolutional neural network model to extract image features, which improves image retrieval efficiency. The cloud server returns only similar results within the threshold range and does not rank them by similarity, which further improves security. The whole retrieval process takes place in the ciphertext domain, so secure image retrieval is achieved without leaking stored data or information related to the retrieval result.

Owner:XIDIAN UNIV

Method for similarity query of abnormal modes of flight data

Inactive | CN102163236A | Reduce dimensionality; Small amount of calculation; Special data processing applications; Similarity query; Normal mode

The invention provides a method for similarity query of abnormal modes of flight data, comprising: step 1, a mode representation process; step 2, an abnormal mode retrieval process; and step 3, a similarity query process. By using a mode representation technique based on important points, the method reduces the dimensionality of the flight data and the amount of computation in similarity query, and it also effectively removes noise interference in the flight data, improving the accuracy and reliability of the similarity query. The method further partitions the abnormal modes of flight data compactly using a competitive clustering technique, which yields a simple, concise index structure and improves the efficiency of the similarity query.

Owner:BEIHANG UNIV

Time sequence data similarity query method based on memory calculation

Active | CN108549696A | Reduce excess space | Reduce maintenance costs | Special data processing applications | Local memories | Similarity query

The invention discloses a time sequence data similarity query method based on memory calculation, and belongs to the technical fields of distributed databases, memory calculation and information retrieval. The method uses a cluster of distributed calculation nodes, stores data in memory, and expands the calculation capacity of the cluster by adding distributed nodes. Time sequence data are allocated to the calculation nodes and a memory-resident index is built; after the cluster receives a search request, each calculation node is scheduled to search. Each node partitions and indexes its data entirely in local memory, and can communicate with the other nodes or with global external sub-modules. During a query, the memory-resident index guides the search so that data are read from only some of the nodes, without scanning the entire cluster. For any time sequence given by a user, the method can quickly find most of the similar sequences in a cluster that uses memory at large scale for calculation.

Owner:ANHUI UNIVERSITY OF TECHNOLOGY +1

High-dimensional data similarity connection inquiry method and device based on distance partition tree

Pending | CN108829804A | Reduce complexity | Improve query efficiency | Special data processing applications | Similarity query | One-dimensional space

The embodiment of the invention provides a high-dimensional data similarity connection inquiry method and device based on a distance partition tree. The method comprises the steps of: acquiring high-dimensional original data and mapping the original data into a one-dimensional space; determining a second distance threshold according to a first distance threshold and a property of the chi-square distribution, and constructing the distance partition tree according to the original data and the second distance threshold; traversing the distance partition tree and comparing the nodes in the tree to obtain a set of candidate similar node pairs; and calculating the original distance between the original data included in each candidate similar node pair, and comparing each original distance with the first distance threshold to obtain a similarity inquiry result. The device is used for executing the method. According to the embodiment of the invention, mapping the high-dimensional original data to the one-dimensional space reduces the complexity of calculation, and the distance partition tree finds candidate results at low cost and improves the filtering effect, so that inquiry efficiency is greatly improved.

Owner:LUOYANG NORMAL UNIV

Similarity storage design method based on spectral hashing

Active | CN105550208A | Avoid memory-intensive problems | Avoid overloaded situations | Database distribution/replication | Special data processing applications | Spectral hashing | Similarity query

The invention discloses a similarity storage design method based on spectral hashing. A spectral hashing method is combined with a data mapping algorithm, so that different kinds of high-dimensional data can be rapidly mapped into a distributed node space based on a distributed hash table. The hash table is constructed by distributing it over a Chord ring, and the corresponding hash buckets are mapped onto the Chord ring by a newly designed data mapping algorithm, so that the more similar the data in two hash buckets are, the closer the two buckets lie on the Chord ring. The method also introduces the concept of virtual buckets: each physical node server is regarded as one or more virtual buckets, and the load of each virtual bucket is adjusted dynamically so that the nodes on the Chord ring are load balanced. This addresses the problem of excessive system query overhead during similarity queries and increases data query efficiency.

Owner:NANJING UNIV OF POSTS & TELECOMM

High-dimensional data similarity join query method and device based on mapping space partition

Pending | CN108846067A | Reduce in quantity | Reduce computational complexity | Special data processing applications | Computation complexity | Chi-squared distribution

Embodiments of the present invention provide a high-dimensional data similarity join query method and device based on mapping space partition. The method comprises: acquiring high-dimensional raw data and mapping the raw data to a one-dimensional space; determining a second distance threshold according to a first distance threshold and a property of the chi-square distribution, and dividing the one-dimensional space into a plurality of subspaces according to the second distance threshold; determining the number of the subspace corresponding to each item of raw data; obtaining candidate data pairs according to the second distance threshold and the subspace numbers; and calculating the original distance of each candidate data pair and comparing it with the first distance threshold to obtain a similarity query result. A device for performing the method is also provided. Because the high-dimensional raw data are mapped to the one-dimensional space and divided there according to the second distance threshold before the similarity query is carried out, the computational complexity is lowered, the number of candidate results is reduced, and the query efficiency is improved.
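
The abstract does not give the mapping or the exact threshold formula, so the following is only a sketch under stated assumptions. It assumes the one-dimensional mapping is a projection onto a random Gaussian direction, in which case the squared projected difference divided by the squared original distance follows a chi-square distribution with one degree of freedom, and the second threshold can be taken as the first threshold times the square root of a chi-square quantile. The names `similarity_join`, `eps` and `confidence` are hypothetical.

```python
import random
from math import dist, floor
from statistics import NormalDist


def similarity_join(points, eps, confidence=0.95, seed=0):
    """Approximate self-join: return index pairs (i, j), i < j, whose
    Euclidean distance is at most eps. Pairs may be missed with
    probability of roughly 1 - confidence (assumed random projection)."""
    rng = random.Random(seed)
    dim = len(points[0])
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    # Step 1: map every point to one dimension.
    proj = [sum(a * x for a, x in zip(direction, p)) for p in points]

    # Step 2: second threshold. A chi-square(1) quantile equals the square
    # of a standard normal quantile, so its square root is z itself.
    z = NormalDist().inv_cdf((1.0 + confidence) / 2.0)
    eps2 = eps * z

    # Step 3: number the subspaces (cells of width eps2 on the line).
    cells = {}
    for idx, value in enumerate(proj):
        cells.setdefault(floor(value / eps2), []).append(idx)

    # Step 4: candidates share a cell or sit in adjacent cells;
    # step 5: verify candidates with the original distance.
    result = set()
    for cell, members in cells.items():
        for neighbour in (cell, cell + 1):
            for i in members:
                for j in cells.get(neighbour, []):
                    if i == j or (neighbour == cell and j < i):
                        continue
                    if (abs(proj[i] - proj[j]) <= eps2
                            and dist(points[i], points[j]) <= eps):
                        result.add((min(i, j), max(i, j)))
    return result


if __name__ == "__main__":
    data = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (10.0, 10.0, 10.0)]
    print(similarity_join(data, eps=1.0))
```

Every returned pair is verified against the first threshold, so the output contains no false positives; the trade-off in this sketch is that a small fraction of true pairs can be filtered out by the one-dimensional test.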

Owner:LUOYANG NORMAL UNIV

Measurement space data similarity query method and device based on SQL

Active | CN107562872A | Improve applicability | Improve performance | Special data processing applications | Data set | Similarity query

The invention provides a measurement space data similarity query method and device based on SQL. The method includes the following steps: partitioning a data set, wherein each partition comprises data objects and a reference point; determining a first distance between each data object and the reference point of its partition; determining an index structure for the data objects according to the first distances; determining a second distance between the query object in a query request and the reference point of each partition, and determining a query range for the query object in each partition according to the second distance and a preset distance threshold; and determining, in the index structure, the target data objects that fall within the query range of each partition. The method makes measurement space data similarity queries possible on databases built on SQL technology, which improves the applicability and performance of similarity queries in RDBMS databases.
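
The steps above can be sketched with SQLite as a stand-in relational database. This is an illustration under assumptions, not the patented design: "measurement space" is read here as a metric space with Euclidean distance, each object is assigned to its nearest reference point, the index is an ordinary B-tree index on (partition, distance to reference point), and a final exact-distance check is added. The table and column names (`objs`, `part`, `ref_dist`) are hypothetical.

```python
import sqlite3
from math import dist


def build_index(conn, objects, refs):
    """Store 2-D objects with their partition number and their distance
    to that partition's reference point (the 'first distance')."""
    conn.execute("CREATE TABLE objs (id INTEGER PRIMARY KEY, part INTEGER, "
                 "ref_dist REAL, x REAL, y REAL)")
    conn.execute("CREATE INDEX idx_part_dist ON objs (part, ref_dist)")
    for oid, obj in enumerate(objects):
        # Assumption: an object belongs to its nearest reference point.
        part = min(range(len(refs)), key=lambda k: dist(obj, refs[k]))
        conn.execute("INSERT INTO objs VALUES (?, ?, ?, ?, ?)",
                     (oid, part, dist(obj, refs[part]), obj[0], obj[1]))


def range_query(conn, refs, query, radius):
    """Return ids of objects within `radius` of `query`."""
    hits = []
    for part, ref in enumerate(refs):
        dq = dist(query, ref)  # the 'second distance'
        # Triangle inequality: any object within `radius` of the query has
        # ref_dist in [dq - radius, dq + radius], so SQL can prune the rest.
        rows = conn.execute(
            "SELECT id, x, y FROM objs "
            "WHERE part = ? AND ref_dist BETWEEN ? AND ?",
            (part, dq - radius, dq + radius))
        # Exact check removes candidates that only passed the range filter.
        hits.extend(oid for oid, x, y in rows
                    if dist(query, (x, y)) <= radius)
    return sorted(hits)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    refs = [(0.0, 0.0), (10.0, 10.0)]
    objects = [(1.0, 1.0), (2.0, 0.0), (9.0, 9.0), (10.0, 12.0), (5.0, 5.0)]
    build_index(conn, objects, refs)
    print(range_query(conn, refs, (1.0, 0.0), 1.5))
```

Because the range filter is a plain `BETWEEN` predicate on an indexed column, the pruning step runs inside the database engine, which is presumably the point of expressing the method in SQL.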

Owner:RENMIN UNIVERSITY OF CHINA

Gene data desensitization method for realizing efficient similarity query and access control

Active | CN110263570A | Meet individual query needs | Reduce in quantity | Key distribution for secure communication | Digital data information retrieval | Personalization | Authorization Mode

The invention belongs to the technical field of information security and provides a gene data desensitization method for realizing efficient similarity query and access control. The method effectively supports similarity queries over large-scale gene data in a ciphertext environment, and it supports complex logic queries to meet the personalized query requirements of users. Its authorization mode is flexible: different access authorities can be given to different data, so that a user's data access authority is reliably controlled during the query process. In addition, a specific hash function is used to compress the data, which markedly reduces the number of elements to be matched in the ciphertext state and further improves query and retrieval efficiency.

Owner:UNIV OF ELECTRONICS SCI & TECH OF CHINA +1
