
List supervision based hash sorting method

A hash sorting method applied in the field of data processing. It addresses problems such as the inability to retrieve the most similar data and the curse of dimensionality, while keeping storage cost low.

Active Publication Date: 2021-04-09
浙江实达实工业购科技有限公司
Cites: 2 | Cited by: 0

AI Technical Summary

Problems solved by technology

The first of the two main approaches to nearest neighbor query generally performs well in low-dimensional spaces. However, when the dimensionality rises sharply, the "curse of dimensionality" sets in and search efficiency degrades to roughly that of linear search.
The second is the currently popular hash-based search method, which has attracted growing research attention in recent years because of its faster retrieval and lower storage cost. Its core idea is to convert feature vectors in the high-dimensional space into low-dimensional binary strings in Hamming space while preserving the similarity between the original data. The problem is that, once data points are converted into binary codes, a search must compare the Hamming distance between each data point and the query point, and the Hamming distance is a discrete integer value. Several points may have the same Hamming distance to the query point, so the most similar data cannot be identified.
[0006] Although the hash method based on list supervision makes full use of the sorting information, when several points in the database have the same Hamming distance to the query point, it cannot rank those points any more precisely, which results in sorting errors.
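The tie problem described above can be seen in a small sketch. The bit strings, the point names, and the `hamming` helper are made up for illustration and do not come from the patent; the only point is that an integer-valued distance cannot separate candidates that differ from the query in the same number of bits.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Illustrative 8-bit codes (assumed values, not taken from the patent).
query = "10110010"
database = {
    "p1": "10110011",  # differs from the query in the last bit
    "p2": "00110010",  # differs from the query in the first bit
    "p3": "10010010",  # differs from the query in the third bit
    "p4": "01001101",  # bitwise complement of the query
}

distances = {name: hamming(query, code) for name, code in database.items()}
best = min(distances.values())
tied = [name for name, d in distances.items() if d == best]

print(distances)
print("tied for nearest:", tied)
```

Here three of the four points sit at the same distance from the query, so plain Hamming distance gives no basis for ordering them.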



Examples


Embodiment Construction

[0045] The present invention will be further described in detail below in conjunction with the accompanying drawings and embodiments.

[0046] As shown in Figure 1, a hash sorting method based on list supervision is used to calculate the sorting information, in Hamming space, between each data point in the data set X and any query point in the query set Q. The data points in the data set X are denoted x_i and the query points in the query set Q are denoted q_j, where i = 1, 2, 3, ..., N and j = 1, 2, 3, ..., M; N is the total number of data points in the data set X and M is the total number of query points in the query set Q. The method includes the following steps:

[0047] Step 1. Calculate the Euclidean distance between a certain query point in the query set Q and each data point in the data set X;

[0048] Step 2. Sort the Euclidean distances calculated in Step 1 in ascending order to obtain the Euclidean distance sorting information between each data point in the data se...
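Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `euclidean_ranking`, the sample points, and the choice to return index positions as the "sorting information" are all assumptions.

```python
import math


def euclidean_ranking(query, data):
    """Return (distances, order) for one query point.

    distances[i] is the Euclidean distance from the query to data[i];
    order lists the indices of data sorted by ascending distance.
    """
    distances = [math.dist(query, x) for x in data]
    order = sorted(range(len(data)), key=lambda i: distances[i])
    return distances, order


# Illustrative 2-D data set X and a single query point q (assumed values).
X = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0), (6.0, 8.0)]
q = (0.0, 0.0)

distances, order = euclidean_ranking(q, X)
print(distances)
print(order)
```

Repeating the call for every query point in Q would give one ranked list per query, which is the kind of ground-truth ordering the later steps compare against.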


Abstract

The invention relates to a hash sorting method based on list supervision, comprising: calculating the Euclidean distance between a query point and each data point; sorting the data points by Euclidean distance in ascending order to obtain a sorted list of the data points in Euclidean space; converting the query point and all data points into binary codes through a hash function; dividing the binary codes of the query point and the data points into sub-blocks of equal length, assigning a different weight to each sub-block, and calculating the Hamming distance between the query point and each data point; calculating sorting information from the Hamming distances to obtain a sorted list of the data points in Hamming space; calculating the total loss function over all query points from the sorted lists of data points in Euclidean space and Hamming space; optimizing the total loss function to obtain the optimal weight of each sub-block; and obtaining the optimal hash function according to the optimal weights. The method can effectively sort data in Hamming space and improve query accuracy.
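The sub-block weighting step might look roughly like the sketch below. The function names, the example codes, and the weight values are hypothetical; in the method described, the weights are obtained by optimizing the total loss function, whereas here they are fixed by hand purely to show how per-block weights turn an integer Hamming distance into a finer-grained score.

```python
def weighted_block_hamming(a: str, b: str, weights) -> float:
    """Weighted Hamming distance over equal-length sub-blocks.

    The codes are split into len(weights) blocks of the same length, and
    each block's bit-difference count is multiplied by that block's weight.
    """
    n_blocks = len(weights)
    if len(a) != len(b) or len(a) % n_blocks != 0:
        raise ValueError("codes must be equal length and divisible into blocks")
    block_len = len(a) // n_blocks
    total = 0.0
    for k, w in enumerate(weights):
        start = k * block_len
        block_a = a[start:start + block_len]
        block_b = b[start:start + block_len]
        total += w * sum(x != y for x, y in zip(block_a, block_b))
    return total


def hamming_ranking(query_code, codes, weights):
    """Rank database codes by ascending weighted distance to the query code."""
    dists = [weighted_block_hamming(query_code, c, weights) for c in codes]
    order = sorted(range(len(codes)), key=lambda i: dists[i])
    return dists, order


# Assumed example: 8-bit codes split into four 2-bit blocks.
query_code = "10110010"
codes = ["10110011", "00110010", "10010010"]  # each is 1 bit away from the query
weights = [4.0, 2.0, 1.0, 0.5]                # hand-picked, NOT learned weights

dists, order = hamming_ranking(query_code, codes, weights)
print(dists)
print(order)
```

All three codes have plain Hamming distance 1 from the query, yet the weighted distances differ because the differing bit falls in a different sub-block each time, which yields a strict ordering.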

Description

Technical Field

[0001] The invention relates to the field of data processing, in particular to a hash sorting method based on list supervision.

Background

[0002] As the most widely used technique in data retrieval, nearest neighbor query has always been a research hotspot. Generally speaking, a nearest neighbor query finds the data most similar to the query object in a database according to the similarity between the data. It is widely used in information retrieval, pattern recognition, data mining and many other fields. However, with the continuous development of modern society and of information technology, the data generated every day has grown explosively in both volume and dimension. How to obtain the most relevant data through nearest neighbor query technology is a problem that urgently needs to be solved.

[0003] At present, there are mainly two methods to solve the nearest neighbor query problem: one...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/2458, G06F16/2457, G06F16/22
Inventors: 杨安邦, 钱江波, 寿震宇, 袁明汶
Owner: 浙江实达实工业购科技有限公司