Acquisition method and system of sample picture data set

A data collection and sample picture technology, applied in the field of data processing, that addresses the highly subjective and error-prone screening results of existing approaches and the wrong classification results they lead to. Its effects are reduced manual labor and subjectivity, faster training, and improved model accuracy.

Inactive Publication Date: 2018-11-23
SICHUAN FEIXUN INFORMATION TECH CO LTD

AI Technical Summary

Problems solved by technology

[0003] The current industry practice is to use web crawlers to crawl a large amount of data and then manually screen and classify the entire sample picture data set. This approach involves an extremely large workload, and the screening results are highly subjective and error-prone. Moreover, training a neural network on an incorrect sample picture data set will later produce wrong classification results.

Examples

Example 1

[0056] The first embodiment of the present invention, as shown in Figure 1:

[0057] A method for obtaining a sample picture data set, comprising:

[0058] The cleaning process of the image data to be cleaned specifically includes:

[0059] Obtain a positive sample picture set and a negative sample picture set. The feature information of the picture data in the positive sample picture set is the same as the feature information of the target picture; the feature information of the picture data in the negative sample picture set is different from the feature information of the target picture;

[0060] Specifically, the feature information includes, but is not limited to, picture features such as picture content and picture category. A preset number of pictures with the same feature information as the target picture can be manually screened out of the pictures crawled from the Internet as the positive sample picture set, and manually screened with a preset num...

Example 3

[0074] The third embodiment of the present invention is a preferred embodiment of the first embodiment, further optimized relative to it. Training a neural network sorter according to the positive sample picture set and the negative sample picture set comprises the steps:

[0075] Delete the last fully connected layer of the pre-trained neural network model;

[0076] After deleting the last fully connected layer, add a fully connected layer followed by an activation layer;

[0077] The neural network sorter is obtained by training the newly added fully connected layer and activation layer according to the positive sample picture set and the negative sample picture set.

[0078] Specifically, training the sorter by transfer learning amounts to deleting the last fully connected layer of a pre-trained neural network model (such as MobileNetV1), and then adding a fully connected layer and an activation l...
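
The three steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: `frozen_features` is a hypothetical stand-in for a pre-trained network (such as MobileNetV1) with its last fully connected layer removed, and the toy two-number "pictures", the learning rate, and the epoch count are all assumed. Only the newly added fully connected layer (weights `w`, bias `b`) and its sigmoid activation are trained.

```python
import math
import random


def frozen_features(picture):
    """Hypothetical stand-in for a pre-trained backbone whose last fully
    connected layer has been deleted. Its parameters are never updated."""
    x, y = picture
    return [x, y, x * y]


def sigmoid(z):
    """Activation layer placed after the new fully connected layer."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(picture, w, b):
    """Confidence that the picture has the target picture's features."""
    feats = frozen_features(picture)
    return sigmoid(sum(wi * fi for wi, fi in zip(w, feats)) + b)


def train_new_head(positives, negatives, epochs=200, lr=0.5, seed=0):
    """Train only the new fully connected layer (w, b) by stochastic
    gradient descent on the binary cross-entropy loss."""
    rng = random.Random(seed)
    data = [(p, 1.0) for p in positives] + [(n, 0.0) for n in negatives]
    w = [0.0, 0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for picture, label in data:
            feats = frozen_features(picture)
            # Gradient of the loss with respect to the pre-activation value.
            error = predict(picture, w, b) - label
            w = [wi - lr * error * fi for wi, fi in zip(w, feats)]
            b -= lr * error
    return w, b


# Toy data (assumed): positives cluster near (1, 1), negatives near (-1, -1).
positives = [(1.0, 0.9), (0.8, 1.1), (1.2, 1.0)]
negatives = [(-1.0, -0.9), (-0.8, -1.1), (-1.2, -1.0)]
w, b = train_new_head(positives, negatives)
```

Because the backbone is frozen, each update touches only the small new head, which is consistent with the faster training the summary claims for this approach.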

Example 4

[0079] The fourth embodiment of the present invention is a preferred embodiment of the first embodiment, further optimized relative to it. Classifying the picture data to be cleaned according to the neural network sorter to obtain several confidence sets comprises the steps:

[0080] Input all image data to be cleaned into the neural network sorter to obtain the confidence level of each image data to be cleaned;

[0081] Classify the image data to be cleaned into corresponding confidence sets according to the confidence of the image data to be cleaned and the preset confidence interval division range.

[0082] Specifically, after training the neural network sorter, all the image data to be cleaned are input into the neural network sorter, and the neural network sorter is used to predict each image data to be cleaned to obtain the corresponding confidence. Confidence represents the probability that the image data to be cleaned ...
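
The interval division described above can be sketched as follows. The interval bounds (0.0, 0.5, 0.8, 0.95) and the file names are assumed values for illustration; the text here does not state the preset ranges. Each confidence set holds the pictures whose confidence falls in one half-open interval `[low, high)`, with the last interval closed at 1.0.

```python
from bisect import bisect_right


def split_into_confidence_sets(confidences, edges):
    """Group pictures by confidence interval.

    confidences: dict mapping picture name -> confidence in [0, 1]
    edges: ascending lower bounds of the intervals, starting at 0.0
    Returns a list of lists; set i covers [edges[i], edges[i + 1]),
    and the last set covers [edges[-1], 1.0].
    """
    sets = [[] for _ in edges]
    for name, conf in confidences.items():
        if not 0.0 <= conf <= 1.0:
            raise ValueError(f"confidence out of range for {name}: {conf}")
        # Index of the last lower bound that is <= conf.
        index = bisect_right(edges, conf) - 1
        sets[index].append(name)
    return sets


# Assumed sorter outputs and interval bounds.
scores = {"a.jpg": 0.97, "b.jpg": 0.83, "c.jpg": 0.40, "d.jpg": 0.95, "e.jpg": 0.50}
edges = [0.0, 0.5, 0.8, 0.95]
confidence_sets = split_into_confidence_sets(scores, edges)
```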

Abstract

The invention provides an acquisition method and system of a sample picture data set. The method comprises the following steps: executing the cleaning process of picture data to be cleaned: acquiring a positive sample picture set and a negative sample picture set, wherein the feature information of picture data in the positive sample picture set is the same as that of a target picture, and the feature information of picture data in the negative sample picture set is different from that of the target picture; performing training according to the positive and negative sample picture sets to obtain a neural network sorter; classifying the picture data to be cleaned according to the neural network sorter to obtain a plurality of confidence sets; and executing an acquisition process of the sample picture data set: acquiring picture data in the confidence sets whose confidence levels reach a preset level to obtain the sample picture data set. With this method and system, screening and classification are automated to obtain the sample picture data set, and the efficiency and accuracy of screening and classification are increased.
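
The final acquisition step can be sketched as follows. It assumes each confidence set is keyed by the lower bound of its confidence interval, and that "reaches a preset level" means this lower bound is at least a chosen threshold; the threshold 0.8 and the file names are hypothetical.

```python
def acquire_sample_set(confidence_sets, preset_level):
    """Collect pictures from every confidence set whose interval lower
    bound is at least preset_level, highest-confidence set first.

    confidence_sets: dict mapping interval lower bound -> list of picture names
    """
    selected = []
    for lower_bound in sorted(confidence_sets, reverse=True):
        if lower_bound >= preset_level:
            selected.extend(confidence_sets[lower_bound])
    return selected


# Assumed confidence sets produced by the cleaning process.
confidence_sets = {
    0.0: ["c.jpg"],
    0.5: ["e.jpg"],
    0.8: ["b.jpg"],
    0.95: ["a.jpg", "d.jpg"],
}
sample_set = acquire_sample_set(confidence_sets, preset_level=0.8)
```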

Description

Technical field

[0001] The invention relates to the field of data processing, in particular to a method and system for acquiring sample picture data sets.

Background

[0002] As is well known, training a deep learning convolutional neural network requires massive amounts of data; the data volume of a mature neural network can easily reach the terabyte level. Taking the convolutional neural network as an example, the input data is generally pictures. A relatively large picture is a few megabytes, and even a relatively small picture is usually a few hundred kilobytes, so handling terabytes of such data is a very large workload.

[0003] The current industry practice is to use web crawlers to crawl a large amount of data, and then manually screen and classify all sample image data collections. The problem brought about by this processing method is that the workload is extremely huge, and the s...

Application Information

IPC(8): G06F17/30, G06K9/62, G06N3/08
CPC: G06N3/08, G06F18/211, G06F18/241
Inventor: 罗培元
Owner: SICHUAN FEIXUN INFORMATION TECH CO LTD