Large-scale image data similarity searching method based on EMD (earth mover's distance)

An EMD-based similarity search technology, applied in electrical digital data processing, special data processing applications, instruments, etc., that addresses problems such as the heavy workload of manual annotation, unbalanced computing load across computing nodes, and unsatisfactory scalability.

Active Publication Date: 2015-06-03
GUANGXI UNIV

AI Technical Summary

Problems solved by technology

[0003] To find valuable image information in large-scale image datasets, the traditional text-based image retrieval method (Text-Based Image Retrieval, TBIR for short) clearly cannot meet the demand.
Because TBIR relies on manual annotation of image content, a sharp increase in the number of images brings two serious problems: first, the workload and cost of manual annotation become too high; second, manual annotation is highly subjective, which directly affects the reliability of image retrieval results.
Given the high computational complexity of the EMD distance, estimating the computation cost of a computing node by its amount of data (rather than by the actual number of EMD distance calculations) is clearly biased; it makes it harder to balance the computing load across computing nodes and directly reduces the overall query processing performance of the distributed system.
In addition, when the size of the image dataset surges, the distributed index in Melody-Join still does not filter out enough irrelevant calculations.
Together, these two issues mean that the scalability of Melody-Join on large-scale image datasets cannot meet the needs of practical applications.
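The bias described above can be illustrated with a small sketch. It assumes a worst case in which every query image in a partition is compared with every candidate image in that partition (no index filtering); the function names and partition sizes are hypothetical and not taken from the patent.

```python
def data_volume_cost(num_queries: int, num_images: int) -> int:
    """Cost estimate based only on how many records a node receives."""
    return num_queries + num_images


def emd_count_cost(num_queries: int, num_images: int) -> int:
    """Cost estimate based on the number of EMD computations, assuming
    (hypothetically) that every query is compared with every image."""
    return num_queries * num_images


# Two hypothetical partitions holding the same number of records.
partitions = {"A": (10, 990), "B": (500, 500)}

for name, (num_queries, num_images) in partitions.items():
    print(
        name,
        data_volume_cost(num_queries, num_images),
        emd_count_cost(num_queries, num_images),
    )
```

Both partitions hold 1000 records, yet under this assumption partition B needs 250000 EMD computations against 9900 for partition A, so a volume-based estimate would treat two very different workloads as equal.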



Examples


Embodiment Construction

[0061] As shown in Figure 3, the steps of the large-scale image data similarity search method based on the EMD distance in this embodiment include:

[0062] 1) Design an image data mapping function f that maps image data to a one-dimensional real key-value space Ω(Φ). The function f defines the mapping between each piece of image data and one of the keys in the one-dimensional real key-value space Ω(Φ);

[0063] 2) Start a MapReduce job MR1, and use MR1 to estimate the query processing load corresponding to each key value in the one-dimensional real key-value space Ω(Φ), based on the query image set Q and the image set I to be retrieved;

[0064] 3) Start a MapReduce job MR2, and use the Map tasks of MR2 to partition the one-dimensional real key-value space Ω(Φ) based on the query processing load estimated in step 2), and divide the one-dimensional real key-value space Ω(Φ) the image data fragments in the query image set Q corresponding t...
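The load-based partitioning of the key space in step 3) can be sketched as follows. The source text is truncated and does not give the partitioning rule, so this is only one plausible approach: a greedy split of the sorted keys into contiguous ranges of roughly equal estimated load. The function name `partition_by_load` and the sample loads are hypothetical.

```python
from typing import List, Tuple


def partition_by_load(
    loads: List[Tuple[float, float]], num_parts: int
) -> List[List[float]]:
    """Split keys into contiguous ranges with roughly equal estimated load.

    loads: (key, estimated_load) pairs, e.g. as produced by a job like MR1.
    Returns at most num_parts lists of keys, in ascending key order.
    """
    ordered = sorted(loads)
    total = sum(load for _, load in ordered)
    target = total / num_parts  # ideal load per partition

    parts: List[List[float]] = []
    current: List[float] = []
    accumulated = 0.0
    for key, load in ordered:
        current.append(key)
        accumulated += load
        # Close the current range once the running total reaches the next
        # multiple of the target; the last range takes whatever remains.
        if accumulated >= target * (len(parts) + 1) and len(parts) < num_parts - 1:
            parts.append(current)
            current = []
    if current:
        parts.append(current)
    return parts


# Hypothetical per-key loads: two "hot" keys and four "cold" ones.
estimated = [(0.1, 1), (0.2, 1), (0.3, 8), (0.4, 1), (0.5, 1), (0.6, 8)]
print(partition_by_load(estimated, 2))
```

With these sample loads each of the two ranges carries an estimated load of 10, even though the hot keys are unevenly placed. A split by key count alone would also give three keys per range here, but it would not adapt if the hot keys were adjacent.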


Abstract

The invention discloses a large-scale image data similarity search method based on the EMD (earth mover's distance). The method comprises the following steps: an image data mapping function f is designed to map image data to a one-dimensional real key-value space Ω(Φ); a job MR1 is started to estimate the load of each key value in Ω(Φ); a job MR2 is started, whose Map tasks partition Ω(Φ) based on the estimated key-value loads and send the data corresponding to each partition to the Reduce tasks in segments; each Reduce task maps the image data it receives to key values in Ω(Φ) using f and builds an EMD-oriented index structure on those key values; EMD-based similarity search is executed on the index structure; and the union of the EMD-based similarity search results of all Reduce tasks in MR2 is output. The method has the advantages of a lower volume of data transmitted over the network, a more balanced distribution of computing load, higher similarity search efficiency, and better scalability for analyzing and processing large datasets.
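The key-based filtering idea can be sketched in a single process. The patent text available here does not define f or the index structure, so the sketch makes simplifying assumptions: images are one-dimensional histograms normalized to total mass 1, the ground distance between bins i and j is |i - j|, and the hypothetical key function is the histogram mean. Under these assumptions the absolute key difference never exceeds the EMD, so candidates whose keys differ by more than the threshold can be skipped without computing the EMD.

```python
import bisect
from itertools import accumulate
from typing import Dict, Sequence, Set, Tuple


def emd_1d(p: Sequence[float], q: Sequence[float]) -> float:
    """EMD between two histograms of equal total mass on the same bins with
    ground distance |i - j|: the sum of absolute CDF differences."""
    return sum(abs(a - b) for a, b in zip(accumulate(p), accumulate(q)))


def key(p: Sequence[float]) -> float:
    """Hypothetical mapping to a one-dimensional key: the histogram mean."""
    return sum(i * w for i, w in enumerate(p))


def similarity_search(
    queries: Dict[str, Sequence[float]],
    images: Dict[str, Sequence[float]],
    theta: float,
) -> Set[Tuple[str, str]]:
    """Return (query, image) pairs whose EMD is at most theta."""
    # Sorted list of (key, name) pairs acts as a simple one-dimensional index.
    index = sorted((key(h), name) for name, h in images.items())
    keys = [k for k, _ in index]
    results: Set[Tuple[str, str]] = set()
    for q_name, q in queries.items():
        k = key(q)
        # Filter: only keys within theta of the query key can qualify.
        lo = bisect.bisect_left(keys, k - theta)
        hi = bisect.bisect_right(keys, k + theta)
        # Verify: compute the exact EMD for the surviving candidates.
        for _, name in index[lo:hi]:
            if emd_1d(q, images[name]) <= theta:
                results.add((q_name, name))
    return results


images = {"a": [1.0, 0, 0, 0], "b": [0.5, 0.5, 0, 0], "c": [0, 0, 0, 1.0]}
queries = {"q": [1.0, 0, 0, 0]}
print(sorted(similarity_search(queries, images, 0.5)))
```

In a distributed setting, each Reduce task would run a search like this on its own key range and the per-task result sets would be merged by union, as the abstract describes; the sketch omits that distribution.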

Description

Technical field

[0001] The invention relates to similarity search technology for computer image data, and in particular to a large-scale image data similarity search method based on the EMD distance.

Background technique

[0002] With the spread of digital devices such as portable computers, smartphones, and digital cameras, multimedia data, represented by images, has grown explosively. All of this indicates that the era of image big data has arrived. Academia, industry, and even government agencies have begun to pay close attention to the analysis and processing of image big data.

[0003] To find valuable image information in large-scale image datasets, the traditional text-based image retrieval method (Text-Based Image Retrieval, TBIR for short) clearly cannot meet the demand. Because TBIR technology relies on manual annotation of image content, a sharp increase in the number of images brings two serious problems: first, the workload of ma...


Application Information

IPC: IPC(8): G06F17/30
Inventor: 许嘉, 吕品, 李陶深, 陈宁江, 许华杰, 文珺, 张佳振
Owner: GUANGXI UNIV