A Method for Removing Redundancy of Information from Sample Sets

A de-redundancy technique for sample sets, applied in the field of sample-set information de-redundancy, that addresses problems such as removing redundant information from large-scale data sets.

Active Publication Date: 2021-09-14

ZHEJIANG UNIV

4 Cites, 0 Cited by


AI Technical Summary

Problems solved by technology

Because data sets are large, the relationships between samples are complex, and pairwise comparison and analysis of samples requires enormous computing power, there is currently no technical solution that can be applied directly to the information redundancy of large-scale data sets.
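To make the cost of pair-based comparison concrete, the following is a small arithmetic sketch and not part of the patent text. It counts the unordered pairs among n samples, n(n - 1)/2; the function name `pairwise_comparisons` and the sample counts are illustrative assumptions.

```python
def pairwise_comparisons(n: int) -> int:
    """Return the number of unordered sample pairs among n samples."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Each of the n samples can be paired with the other n - 1 samples;
    # dividing by 2 avoids counting (a, b) and (b, a) twice.
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Illustrative data set sizes (assumed, not taken from the patent).
    for n in (1_000, 1_000_000, 10_000_000):
        print(f"{n:>12,d} samples -> {pairwise_comparisons(n):,d} pairs")
```

The pair count grows quadratically, so a data set ten times larger needs roughly one hundred times as many comparisons.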


Embodiment Construction

[0034] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0035] First, the DNN (deep neural network) of the present invention is a multi-layer feed-forward artificial neural network whose neurons respond to surrounding units within a preset coverage range. Through weight sharing and feature aggregation, it effectively extracts feature information from samples.
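As a rough illustration of a multi-layer feed-forward network used as a feature extractor, here is a minimal sketch in plain Python. It is an assumption-laden toy: the layer sizes, weights, ReLU activation, and the names `dense`, `relu`, and `forward` are all hypothetical, and it uses fully connected layers, so it does not model the local coverage range or weight sharing that the paragraph mentions.

```python
from typing import List, Sequence, Tuple

Vector = List[float]
Layer = Tuple[Sequence[Sequence[float]], Sequence[float]]  # (weights, bias)


def relu(values: Sequence[float]) -> Vector:
    """Element-wise ReLU activation."""
    return [max(0.0, v) for v in values]


def dense(x: Sequence[float], weights: Sequence[Sequence[float]],
          bias: Sequence[float]) -> Vector:
    """Fully connected layer: one output per weight row, plus a bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def forward(x: Sequence[float], layers: Sequence[Layer]) -> Vector:
    """Pass x through each layer in order; the final activations are the
    extracted feature vector."""
    h = list(x)
    for weights, bias in layers:
        h = relu(dense(h, weights, bias))
    return h


# Hypothetical two-layer network: 2 inputs -> 2 hidden units -> 1 feature.
EXAMPLE_LAYERS: List[Layer] = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [-0.5]),
]

if __name__ == "__main__":
    print(forward([2.0, 1.0], EXAMPLE_LAYERS))
```

In practice the weights would be learned rather than written by hand; the fixed values here only keep the example self-contained.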

[0036] Teacher-Student (teacher-student model): a neural network structure based on distillat...


Abstract

The present invention provides an information de-redundancy method for a sample set. The method includes: obtaining samples to be processed and their corresponding trainable labels to form the original sample set to be processed; performing feature extraction on the original sample set to obtain its feature vector set; inputting the feature vector set into a learnable sample selector model, which performs sample selection on the feature vector set and obtains a representative feature vector subset according to a preset threshold; and taking the original samples corresponding to the feature vector subset as the sub-sample set from which redundant information has been removed. The technical solution of the present invention can efficiently streamline the original sample set, removing redundant information while retaining informative samples, and improves the efficiency of training algorithms on the sample set.
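The steps listed in the abstract can be sketched as a short pipeline. This is a hypothetical illustration, not the patented implementation: the abstract does not specify the selector's architecture or how the threshold is applied, so the sketch assumes a logistic scoring function with fixed weights in place of the learnable selector and keeps samples whose score is at or above the threshold. All names (`selector_scores`, `remove_redundancy`, and so on) are invented for the example.

```python
import math
from typing import Any, Callable, List, Sequence, Tuple

Vector = List[float]


def extract_features(samples: Sequence[Any],
                     extractor: Callable[[Any], Vector]) -> List[Vector]:
    """Step 2: map every original sample to a feature vector."""
    return [extractor(s) for s in samples]


def selector_scores(features: Sequence[Vector], weights: Sequence[float],
                    bias: float) -> List[float]:
    """Step 3 (assumed form): score each feature vector in (0, 1) with a
    logistic function. A real selector would learn weights and bias."""
    scores = []
    for f in features:
        z = sum(w * x for w, x in zip(weights, f)) + bias
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


def remove_redundancy(samples: Sequence[Any], labels: Sequence[Any],
                      extractor: Callable[[Any], Vector],
                      weights: Sequence[float], bias: float,
                      threshold: float) -> List[Tuple[Any, Any]]:
    """Steps 1-4: build the set, extract features, select by threshold, and
    return the original (sample, label) pairs that were selected."""
    features = extract_features(samples, extractor)
    scores = selector_scores(features, weights, bias)
    # Assumption: a sample is kept when its score reaches the threshold.
    kept = [i for i, s in enumerate(scores) if s >= threshold]
    return [(samples[i], labels[i]) for i in kept]


if __name__ == "__main__":
    samples = [[0.0, 0.0], [2.0, 1.0], [-3.0, 0.5]]
    labels = ["a", "b", "c"]
    subset = remove_redundancy(samples, labels, extractor=list,
                               weights=[1.0, 0.0], bias=0.0, threshold=0.6)
    print(subset)
```

Returning the original samples rather than their feature vectors mirrors the last step of the abstract, where the selected feature vectors are mapped back to the samples they came from.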

Description

Technical field

[0001] The invention relates to the technical field of data processing, and in particular to an information de-redundancy method for a sample set.

Background

[0002] With the development of deep learning technology, machine learning methods based on large-scale data sets are continually being proposed. In practice, however, large-scale data sets often contain a large amount of redundant information and data, for example an excess of samples from a single category, or repeated or near-duplicate samples. At the same time, large-scale data sets make the training of machine learning models more complex, and that training consumes large amounts of computing power and time. Therefore, in the face of large-scale training tasks in different scenarios, for example, ultra-large-scale computer vision classification tasks often use tens of millions of image samples for training, or ultra-large-scale natural language processing tasks often use hundreds of mill...


Application Information


Patent Type & Authority: Patents (China)

IPC(8): G06K9/62, G06N3/04, G06N3/08, G06N20/20

CPC: G06N20/20, G06N3/084, G06N3/045, G06F18/2415, G06F18/214

Inventors: 程战战, 许昀璐, 吴飞, 浦世亮

Owner: ZHEJIANG UNIV
