
Quick massive-picture deduplication method

A technology for deduplicating massive picture collections, applied in special data processing applications, instruments, electrical digital data processing, etc. It addresses problems such as image deletion errors and improves the efficiency and speed of feature extraction.

Status: Inactive
Publication Date: 2018-09-28
杨晓春

AI Technical Summary

Problems solved by technology

However, when high-dimensional image features are mapped to low-dimensional hash buckets to form m-dimensional hash features, the high-dimensional image features are very sparse compared to the m-dimensional Hamming space. If only these sparse low-dimensional features are extracted and used as the features of the image itself, a large number of different images will fall into the same hash bucket. Additional pairwise precise feature comparisons are then needed among the images in each bucket before duplicate images can be found. Moreover, different images may receive the same hash code, resulting in image deletion errors.
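The bucket-collision problem described above can be illustrated with a small experiment. This is only a sketch under stated assumptions: the 64-bit fingerprint width, the value m = 8, and the use of random bit sampling as the mapping are all illustrative choices, not details taken from the patent.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

FINGERPRINT_BITS = 64  # assumed fingerprint width
M = 8                  # assumed hash-feature size: 2**8 = 256 possible buckets

# 1000 distinct fingerprints, standing in for 1000 different images.
fingerprints = set()
while len(fingerprints) < 1000:
    fingerprints.add(rng.getrandbits(FINGERPRINT_BITS))

# One random mapping (assumption): sample M bit positions of the fingerprint.
positions = rng.sample(range(FINGERPRINT_BITS), M)

def bucket_of(fp):
    """Concatenate the sampled bits of fp into an M-bit bucket code."""
    code = 0
    for p in positions:
        code = (code << 1) | ((fp >> p) & 1)
    return code

# Group the different images by bucket code.
buckets = {}
for fp in fingerprints:
    buckets.setdefault(bucket_of(fp), []).append(fp)

largest = max(len(v) for v in buckets.values())
print(f"{len(buckets)} buckets used, largest holds {largest} different images")
```

With 1000 different images and at most 256 buckets, some bucket must hold at least four different images, so a single low-dimensional mapping cannot by itself decide which images are duplicates.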



Embodiment Construction

[0022] The present invention will be further described below in conjunction with specific embodiments. The exemplary embodiments and descriptions of the present invention are used to explain the present invention, but not as a limitation to the present invention.

[0023] As shown in Figure 1, the fast method for deduplicating massive pictures in this embodiment includes the following steps:

[0024] Step 1: generate image fingerprint information for each picture to be deduplicated using a perceptual hashing algorithm;

[0025] Step 2: use multiple groups of random hash mappings to construct an image hash feature dictionary, and thereby remove duplicate pictures.
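Step 2 can be sketched as follows. The text visible here does not specify how the random hash mappings are built or combined, so this sketch assumes that each mapping samples a few bit positions of an integer fingerprint and that the codes from all groups are combined into one dictionary key. The names `make_hash_maps`, `composite_key`, and `deduplicate` are hypothetical.

```python
import random

def make_hash_maps(num_groups, bits_per_group, fingerprint_bits=64, seed=0):
    """Build num_groups random mappings; each is a list of sampled bit positions."""
    rng = random.Random(seed)
    return [rng.sample(range(fingerprint_bits), bits_per_group)
            for _ in range(num_groups)]

def composite_key(fingerprint, hash_maps):
    """Apply every mapping to the fingerprint and combine the codes into one key."""
    key = []
    for positions in hash_maps:
        code = 0
        for p in positions:
            code = (code << 1) | ((fingerprint >> p) & 1)
        key.append(code)
    return tuple(key)

def deduplicate(fingerprints, hash_maps):
    """fingerprints: dict of name -> int fingerprint.

    Returns (kept, duplicates), where duplicates pairs each removed name
    with the name of the picture already stored under the same key.
    """
    dictionary = {}  # composite key -> first picture seen with that key
    kept, duplicates = [], []
    for name, fp in fingerprints.items():
        key = composite_key(fp, hash_maps)
        if key in dictionary:
            duplicates.append((name, dictionary[key]))
        else:
            dictionary[key] = name
            kept.append(name)
    return kept, duplicates

hash_maps = make_hash_maps(num_groups=4, bits_per_group=12)
pictures = {
    "a": 0xF0F0F0F0F0F0F0F0,
    "b": 0xF0F0F0F0F0F0F0F0,  # same fingerprint as "a"
    "c": 0x0F0F0F0F0F0F0F0F,  # every bit differs from "a"
}
kept, duplicates = deduplicate(pictures, hash_maps)
print(kept, duplicates)
```

Each picture costs one dictionary lookup and no pairwise feature comparison. Under this design, using several groups makes it less likely that two different fingerprints agree on every sampled bit, which is the role the source attributes to the multiple mappings.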

[0026] The perceptual hashing algorithm generates image fingerprint information by constructing low-dimensional features of the image, and uses this fingerprint as a global description of the image when comparing images to search for similar ones. It is easy to understand through the study of perceptual hashing algori...
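As an illustration of how a perceptual hash turns an image into a short fingerprint, here is a minimal sketch of average hashing (aHash), one common perceptual hash. The source does not say which perceptual hash it uses, so this is a stand-in rather than the patent's algorithm. The image is assumed to be a 2D list of grayscale values with at least `size` rows and columns; the function names are hypothetical.

```python
def average_hash(gray, size=8):
    """Return a size*size-bit fingerprint of a 2D grayscale image (list of rows)."""
    h, w = len(gray), len(gray[0])
    # Low-dimensional feature: average the pixels inside each cell of a size x size grid.
    cells = []
    for r in range(size):
        r0, r1 = r * h // size, (r + 1) * h // size
        for c in range(size):
            c0, c1 = c * w // size, (c + 1) * w // size
            block = [gray[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    bits = 0
    for v in cells:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")

# A 16x16 gradient, a uniformly brightened copy, and an inverted copy.
original = [[x * 8 + y for x in range(16)] for y in range(16)]
brighter = [[v + 10 for v in row] for row in original]
inverted = [[255 - v for v in row] for row in original]

fp = average_hash(original)
print(hamming(fp, average_hash(brighter)), hamming(fp, average_hash(inverted)))
```

The brightened copy keeps the same fingerprint because every cell and the mean shift by the same amount, while the inverted copy gets a different one.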


Abstract

The invention discloses a fast method for deduplicating massive picture collections. Image fingerprint information is generated for each picture to be deduplicated by a perceptual hash algorithm, and multiple sets of random hash mappings are used to construct an image hash feature dictionary, from which duplicate pictures are removed. Compared with the prior art, the method generates image fingerprints by constructing low-dimensional image features and builds the image hash feature dictionary to deduplicate massive image sets quickly, completely eliminating the time-consuming image feature comparison. By designing multiple reasonable hash mappings, it alleviates the sparse mapping space problem caused by low-dimensional image features and locality-sensitive hashing. The method can quickly extract image features and locate duplicate images, reduces the number of image feature comparisons to 0, and greatly improves the efficiency of deduplicating massive images while ensuring accuracy.

Description

Technical field

[0001] The invention relates to the technical field of picture deduplication processing, and in particular to a fast method for deduplicating massive pictures.

Background technique

[0002] Existing duplicate image removal methods first extract color, texture, shape and other features from each image in the image data set, and then use the similarity of those features to measure the similarity between images, thereby removing duplicate images. But once the number of images grows to a certain scale, the time required to perform pairwise feature comparison on the images becomes unacceptably large. Moreover, in a large image collection most images are unrelated to one another, so most pairwise feature comparisons contribute nothing to deduplication while consuming a great deal of computing time. This makes deduplication of massive images inefficient and time-con...


Application Information

IPC(8): G06F17/30, G06K9/62
CPC: G06F18/2136, G06F18/22
Inventor: 杨晓春, 王斌, 王晓琼
Owner: 杨晓春