
Quick large-scale high-dimensional data retrieval method and system

A high-dimensional data retrieval technology, applied in the field of fast retrieval methods and systems for large-scale high-dimensional data, that addresses the low retrieval accuracy and efficiency of existing methods and improves accuracy, time efficiency, and quantization.

Active Publication Date: 2018-04-20
TSINGHUA UNIV

AI Technical Summary

Problems solved by technology

[0004] However, existing search and retrieval methods for large-scale high-dimensional data can only compress the data to a certain extent and cannot prune it effectively, so retrieval accuracy and efficiency are low.



Examples


Embodiment Construction

[0041] The specific implementation manners of the present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments. The following examples are used to illustrate the present invention, but are not intended to limit the scope of the present invention.

[0042] As shown in Figure 1, an embodiment of the present invention provides a fast retrieval method for large-scale high-dimensional data, comprising:

[0043] S1. Based on the trained product quantization unit, obtain the binary code corresponding to the data to be retrieved; the binary code is used to determine the cluster center closest to the data to be retrieved.

[0044] S2. Input the binary code into the inverted multi-index unit matched with the trained product quantization unit, and obtain the set of data in the preset database with the smallest distance to the data to be retrieved.

[0045] S3. Sort all the data in the set according to the distance between...
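
Steps S1 to S3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`train_pq`, `encode`, `build_index`, `search`), the toy k-means trainer, and the parameters `m`, `k`, `n_cells`, and `top` are all assumptions made for illustration. Two simplifications are also assumed: index cells are ranked by scoring every occupied cell rather than by a dedicated traversal algorithm, and the final sort uses the exact squared Euclidean distance.

```python
import numpy as np
from collections import defaultdict

def kmeans(x, k, iters=10, seed=0):
    """Toy k-means used only to obtain sub-codebooks (illustrative)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(iters):
        d = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            pts = x[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return centers

def train_pq(data, m, k):
    """Train one codebook of k centers per subspace (m subspaces)."""
    return [kmeans(s, k, seed=i) for i, s in enumerate(np.split(data, m, axis=1))]

def encode(codebooks, x):
    """S1: map each vector to m center indices (its code, m*log2(k) bits)."""
    subs = np.split(np.atleast_2d(x), len(codebooks), axis=1)
    codes = [((s[:, None, :] - cb[None]) ** 2).sum(-1).argmin(1)
             for s, cb in zip(subs, codebooks)]
    return np.stack(codes, axis=1)

def build_index(codes):
    """Inverted multi-index: one posting list per tuple of center indices."""
    index = defaultdict(list)
    for i, c in enumerate(codes):
        index[tuple(c)].append(i)
    return index

def search(query, codebooks, index, data, n_cells=8, top=5):
    subs = np.split(query, len(codebooks))
    # Per-subspace squared distances from the query to every center.
    tables = [((cb - s) ** 2).sum(1) for s, cb in zip(subs, codebooks)]
    # S2: visit the occupied cells whose centers are closest to the query.
    cells = sorted(index, key=lambda c: sum(t[j] for t, j in zip(tables, c)))
    cand = np.array([i for c in cells[:n_cells] for i in index[c]])
    # S3: sort the candidate set by distance to the query.
    d = ((data[cand] - query) ** 2).sum(1)
    order = np.argsort(d)[:top]
    return cand[order], d[order]

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 8))          # synthetic database
codebooks = train_pq(data, m=2, k=4)
index = build_index(encode(codebooks, data))
ids, dists = search(data[17], codebooks, index, data)
```

Because the query here is itself a database point, its own cell is the best-scoring cell, so point 17 is returned first with distance zero. Only the vectors in the visited cells are compared against the query, which is where the pruning comes from.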


Abstract

The invention provides a method and system for approximate nearest neighbor retrieval of large-scale high-dimensional data based on product quantization and an inverted multi-index. The method comprises the following steps: binary codes corresponding to the data to be retrieved are obtained from a trained product quantization unit, the binary codes being used to determine the cluster center nearest to the data to be retrieved; the binary codes are input into an inverted multi-index unit matched with the trained product quantization unit, and a set composed of the data in a preset database nearest to the data to be retrieved is obtained; all the data in the set is sorted according to the distance between each piece of data in the set and the data to be retrieved, and the sorted data serves as the retrieval result. The large-scale similarity retrieval method and system for high-dimensional data can greatly improve retrieval accuracy and time efficiency.
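
For readers unfamiliar with product quantization, the standard textbook formulation may help. The notation below ($D$, $M$, $K$, $c_{m,k}$, $k_m$) is assumed for illustration and does not appear in the patent text, which may define its quantizer differently.

```latex
% A D-dimensional vector is split into M subvectors of dimension D/M:
x = \left[ x^{1}, x^{2}, \dots, x^{M} \right], \qquad x^{m} \in \mathbb{R}^{D/M}

% Each subvector is assigned to the nearest of K centers in its subspace:
k_m(x) = \operatorname*{arg\,min}_{1 \le k \le K} \left\lVert x^{m} - c_{m,k} \right\rVert^{2}

% The code (k_1(x), \dots, k_M(x)) takes M \log_2 K bits.
% One common way to approximate the query-to-database distance from the code:
\lVert q - x \rVert^{2} \approx \sum_{m=1}^{M} \left\lVert q^{m} - c_{m,\,k_m(x)} \right\rVert^{2}
```

With $M$ codebooks of $K$ centers each, the code identifies one of $K^{M}$ composite cluster centers, which is the cell that an inverted multi-index uses to group database vectors.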

Description

Technical field

[0001] The present invention relates to the technical field of computer data management, and more specifically, to a fast retrieval method and system for large-scale high-dimensional data.

Background

[0002] With the rapid development of the Internet, large-scale high-dimensional data is becoming increasingly common in search engines and social networks and is attracting growing attention. As multimedia resources on the Internet continue to increase, finding relevant data quickly and effectively in large-scale high-dimensional data is a major challenge in terms of both time and space.

[0003] In the prior art, the following method is usually used to search and retrieve large-scale high-dimensional data: step 1, use an initialization retrieval method to establish an initialization index for the high-dimensional database point set, and establish the nearest index of the high-dimensional database point set Neigh...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/3331, G06F16/35
Inventor: 王建民, 龙明盛, 曹越, 刘斌
Owner: TSINGHUA UNIV