Method and equipment for updating training data set

A technique for updating training data sets, applied in image data processing, neural learning methods, and character and pattern recognition. It addresses the problem of unbalanced class samples and does so at low cost.

Pending Publication Date: 2022-02-08
SHENGDOUSHI SHANGHAI SCI & TECH DEV CO LTD
0 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

Existing methods each have their own defects, and none of them reliably solves the problem of unbalanced class samples.

Embodiment Construction

[0017] Exemplary embodiments of the present application will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the concepts are fully conveyed to those skilled in the art. In the drawings, the size of some elements may be exaggerated or deformed for clarity. The same reference numerals in the drawings denote the same or similar structures, and thus their detailed descriptions will be omitted.

[0018] Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided in order to give a thorough understanding of the embodiments of the present application. However, those skilled in the art will appreciate that the technical solutions of the present application may be prac...

Abstract

The invention relates to a method for updating a training data set. The method comprises the following steps: acquiring an initial training data set and an unlabeled data set; performing data augmentation on the training data and/or screening the unlabeled data according to preset key information to obtain first expanded data, wherein the first expanded data carries category label information meeting a preset category condition; training a classification model on the first expanded data and the initial training data set; using the trained classification model to predict the category of the to-be-predicted data in the unlabeled data set, and taking as second expanded data the to-be-predicted data whose predicted category meets the preset category condition and whose data attribute meets a preset attribute condition, wherein the to-be-predicted data comprises the unlabeled data in the unlabeled data set other than the first expanded data; and updating the initial training data set according to the first expanded data and/or the second expanded data to obtain an updated training data set.
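
The steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the patented implementation. The abstract does not specify the data type, the classifier, or the conditions, so everything concrete here is an assumption: text samples, a toy token-frequency classifier, the category name `refund` as the category meeting the "preset category condition", the keyword set `KEYWORDS` as the "preset key information", a minimum token count as the "preset attribute condition", and token reversal as a stand-in for real data augmentation.

```python
from collections import Counter, defaultdict

# All constants below are hypothetical and exist only for illustration.
MINORITY = "refund"    # category assumed to satisfy the preset category condition
KEYWORDS = {"refund"}  # assumed "preset key information" for screening
MIN_TOKENS = 3         # assumed "preset attribute condition" on the data itself


def augment(text):
    """Toy augmentation: reverse the token order (stand-in for a real method)."""
    return " ".join(reversed(text.split()))


def train(samples):
    """Toy classifier: per-category token counts."""
    counts = defaultdict(Counter)
    for text, label in samples:
        counts[label].update(text.split())
    return counts


def predict(model, text):
    """Return the category whose normalized token counts best match the text."""
    tokens = text.split()
    scores = {
        label: sum(counter[t] for t in tokens) / sum(counter.values())
        for label, counter in model.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def update_training_set(initial, unlabeled):
    # First expanded data: augment minority samples and/or screen by keywords.
    augmented = [(augment(t), l) for t, l in initial if l == MINORITY]
    screened = [(t, MINORITY) for t in unlabeled if KEYWORDS & set(t.split())]
    first = augmented + screened

    # Train on the initial set plus the first expanded data.
    model = train(initial + first)

    # Predict only on unlabeled data that is not already in the first expanded data.
    screened_texts = {t for t, _ in screened}
    remaining = [t for t in unlabeled if t not in screened_texts]

    # Second expanded data: category condition AND attribute condition must hold.
    second = [
        (t, MINORITY)
        for t in remaining
        if predict(model, t) == MINORITY and len(t.split()) >= MIN_TOKENS
    ]
    return initial + first + second, first, second


initial = [
    ("where is my order", "shipping"),
    ("order has not arrived", "shipping"),
    ("track my parcel please", "shipping"),
    ("please refund my money", "refund"),
]
unlabeled = ["i want a refund now", "money back please", "when will my parcel arrive"]

updated, first, second = update_training_set(initial, unlabeled)
print(len(initial), "->", len(updated))
```

In this toy run the under-represented category grows from one sample to four: one from augmentation, one from keyword screening, and one from model prediction. A real system would replace the toy classifier and the conditions with whatever the application actually requires.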

Description

Technical field

[0001] The present application relates to data preprocessing, in particular to a method and device for updating a training data set, especially for expanding the training data of categories with insufficient samples in the training data set.

Background

[0002] In recent years, more and more algorithm-based classification models have replaced manual classification in the business processes of enterprises. These models classify business data automatically, so that the business departments or business personnel responsible for each type of business data can then process it. Here, the more accurately the parameters of the classification model are trained, the more accurate the prediction result of the model will be, and the better the classification effect will be. Therefore, the training data used to train the classification model is very important. If too little training data is used to model a classification model or to tune...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06V10/774, G06V10/764, G06V10/82, G06K9/62, G06T3/40, G06N3/04, G06N3/08
CPC: G06T3/4007, G06N3/04, G06N3/08, G06F18/241, G06F18/214
Inventor: 凌悦
Owner: SHENGDOUSHI SHANGHAI SCI & TECH DEV CO LTD