
High-accuracy similarity search system

A retrieval system using similarity search technology, applied to retrieving data similar to input unstructured data, with the effect of improving retrieval accuracy.

Inactive Publication Date: 2012-09-26
HITACHI LTD

AI Technical Summary

Problems solved by technology

However, in this method, when score calculation for the non-key data is stopped partway, the remaining non-key data are dropped without their scores being calculated, regardless of whether their score against the search data is smaller than the threshold value r (that is, search omissions occur). From the standpoint of the expected number of such omissions, that is, retrieval accuracy, there is room for improvement.



Examples


Embodiment 1

[0064] Hereinafter, a first embodiment will be described with reference to the drawings.

[0065] The similarity search system of this embodiment is a similar-image search system: a user inputs an image, and the system searches a database on a search server terminal for similar images. Other unstructured data, such as animations, music, documents, and binary data, can be used instead of images. In this embodiment, a color histogram is used as the feature quantity of an image, and the Euclidean distance between feature quantities is used as the score.
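As a minimal sketch of the feature quantity and score described in [0065], the following computes a normalized RGB color histogram and the Euclidean distance between two histograms. The function names, the pixel representation as (r, g, b) tuples, and the choice of 4 bins per channel are assumptions made for illustration; the source does not specify them.

```python
import math


def color_histogram(pixels, bins_per_channel=4):
    """Return a normalized RGB histogram for an iterable of (r, g, b) pixels.

    Each channel value is expected in 0..255. The bin count is an
    illustrative assumption, not a value taken from the patent text.
    """
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
        count += 1
    if count:
        hist = [h / count for h in hist]
    return hist


def score(feature_a, feature_b):
    """Euclidean distance between two feature quantities (smaller = more similar)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(feature_a, feature_b)))


red_image = [(255, 0, 0)] * 4
blue_image = [(0, 0, 255)] * 4
same = score(color_histogram(red_image), color_histogram(red_image))
different = score(color_histogram(red_image), color_histogram(blue_image))
```

Because the score here is a distance, identical images give 0 and images with disjoint colors give a larger value.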

[0066] In the similarity search system of the present embodiment, M pieces of registered data are selected in advance as key data. Key data can be selected, for example, at random. Then the score between each remaining piece of registered data (non-key data) and each piece of key data is calculated, and a first index vector for retrieval is obtained for each piece of non-key data. At t...
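The indexing step in [0066] can be sketched as follows, assuming that the index vector of a non-key datum is simply the list of its scores against each key datum, and that features are plain lists of numbers. The helper names `select_key_data` and `build_index_vectors`, and the fixed random seed, are hypothetical and introduced only for this illustration.

```python
import math
import random


def euclidean(a, b):
    """Score used in this sketch: Euclidean distance between feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_key_data(registered, m, seed=0):
    """Pick M registered items at random as key data; returns their indices."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(len(registered)), m))


def build_index_vectors(registered, key_ids):
    """For each non-key datum, compute its scores against every key datum."""
    keys = [registered[i] for i in key_ids]
    key_set = set(key_ids)
    return {
        i: [euclidean(feature, key) for key in keys]
        for i, feature in enumerate(registered)
        if i not in key_set
    }


registered = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [1.0, 0.0]]
index_vectors = build_index_vectors(registered, key_ids=[0, 1])
random_keys = select_key_data(registered, m=2)
```

With items 0 and 1 as key data, item 2 gets the index vector of its distances to those two keys, and likewise for item 3.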

Embodiment 2

[0221] Hereinafter, a second embodiment will be described with reference to the drawings. The similarity search system of the present embodiment is a biometric system in which a user who is trying to authenticate (hereinafter referred to as an authentication user) inputs biometric information, and the system retrieves similar biometric information from a database in the client terminal, thereby identifying which one (or none) of the users registered in the database (hereinafter referred to as registered users) the authentication user is, and performs authentication according to the result.

[0222] Figure 7 shows a configuration example of the biometric system of this embodiment. Here, the points that differ from Figure 1 are described. In this embodiment, the raw data is biometric information.

[0223] The system is composed of the following parts: a registration terminal 100 that sends the feature quantity of the biometric information obtained from the user to the server...


Abstract

The invention provides a high-accuracy similarity search system. A pivot determination unit determines pivots from the enrolled data. Raw data is acquired, features are extracted from it, and a score is calculated as either a distance or a degree of similarity between features. An index vector is generated from the scores against the pivots, and a score is likewise calculated as either a distance or a degree of similarity between index vectors. Parameters of each non-pivot, including a regression coefficient, are trained using training data. The order in which non-pivots are selected is determined, using the score between the search data and each non-pivot together with the regression coefficient, in descending order of the posterior probability obtained through logistic regression. A search result is output based on the scores between the search data and the enrolled data.
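The ordering step in the abstract can be illustrated with a small sketch. It assumes that each non-pivot has an already trained intercept and regression coefficient, that the input score is a distance (so the coefficient is negative), and that the posterior probability has the usual logistic form; the training step is omitted. The names `rank_non_pivots` and `params` are hypothetical, and the parameter values are made up for the example, not taken from the patent.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def rank_non_pivots(scores, params):
    """Order non-pivots by descending posterior probability.

    scores: {non_pivot_id: score between the search data and that non-pivot}
    params: {non_pivot_id: (intercept, coefficient)}, assumed already trained
    """
    posterior = {
        i: sigmoid(params[i][0] + params[i][1] * s) for i, s in scores.items()
    }
    return sorted(posterior, key=posterior.get, reverse=True)


# Illustrative values only.
scores = {"a": 0.5, "b": 2.0, "c": 1.0}
params = {i: (2.0, -3.0) for i in scores}
order = rank_non_pivots(scores, params)
```

Non-pivots would then be visited in this order when computing exact scores against the search data, so that the most promising candidates are examined first.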

Description

Technical field

[0001] The present invention relates to methods and systems for retrieving data similar to input unstructured data.

Background art

[0002] Retrieving unstructured data similar to input unstructured data, such as images, animations, music, documents, binary data, and biometric information, is called similarity search. Generally, information called feature quantities, which is used for distance calculation (or similarity calculation), is extracted from the original unstructured data (hereinafter referred to as raw data). The search then treats two items as more similar the smaller the distance, which indicates the degree of disagreement between their feature quantities (or the larger the similarity, which indicates the degree of agreement between them). The distance (or similarity) between feature quantities is called a score.

[0003] For example, there is a method of calculating the distance (or sim...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F17/30946; G06K9/62; G06K9/6271; G06F16/901; G06F18/24133
Inventor: 村上隆夫, 高桥健太
Owner: HITACHI LTD