Active learning big data mark method and system

An active learning and big data technology, applied in the field of big data machine learning, that addresses the low accuracy of big data anchor point labeling.

Active Publication Date: 2016-11-30

广州图普网络科技有限公司

Cites: 3; Cited by: 7


AI Technical Summary

Problems solved by technology

[0004] In view of the low accuracy of big data anchor point labeling in the prior art, it is necessary to provide an active learning big data labeling method and system.



Embodiment approach

[0055] As a specific implementation, the active learning big data labeling method also includes the following steps:

[0056] Use the kernel matrix K to apply a nonlinear mapping to the data points, and obtain the distances between data points after the nonlinear mapping.
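The distance after a kernel mapping can be computed from the kernel matrix alone, using the standard identity ||φ(x_i) − φ(x_j)||² = K_ii + K_jj − 2K_ij. The sketch below illustrates that identity; the Gaussian (RBF) kernel, the `gamma` parameter, and the function names are assumptions chosen for illustration, since this excerpt does not state which kernel the patent uses.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of X (illustrative kernel choice)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_distances(K):
    """Pairwise distances in the feature space induced by kernel matrix K.

    Uses ||phi(x_i) - phi(x_j)||^2 = K_ii + K_jj - 2 K_ij.
    """
    diag = np.diag(K)
    d2 = diag[:, None] + diag[None, :] - 2.0 * K
    # Clip tiny negative values caused by floating-point rounding.
    return np.sqrt(np.maximum(d2, 0.0))

# Three toy data points (hypothetical data).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel(X, gamma=0.5)
D = kernel_distances(K)
```

With a linear kernel (`K = X @ X.T`) the same function reduces to the ordinary Euclidean distance, which is a convenient sanity check.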

[0057] Using a greedy sequential method, the anchor data set used for active learning is determined according to the following formula:

[0058] $z_t \in X$ and […]

[0059] where $Z_{t-1} = \{z_1, \dots, z_{t-1}\}$ denotes the $t-1$ anchor points assumed to be already determined, $z_i = x_{p(i)}$, $p$ denotes the index correspondence, and […] indicates that the $t$-th anchor point is determined according to the formula

[0060] […]

[0061] Initialize $Z = \varnothing$. For $t = 1, \dots, m$ in sequence: calculate the coefficients […]; keeping […] unchanged, calculate […] and […]; update […] according to the proximal point method; and determine the […] that minimizes […], where $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix and […] denotes the $p_i$-th row of the kernel matrix $K$.

[006...
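The greedy sequential idea in [0057] to [0061] can be sketched as follows: anchors are added one at a time, and each new anchor is the candidate that most reduces an objective given the anchors already chosen. The selection formula itself is not reproduced in this excerpt, so the objective used here, the kernel reconstruction residual Tr(K − K_XZ K_ZZ⁺ K_ZX), is an assumed stand-in rather than the patent's formula, and the function names and toy data are hypothetical.

```python
import numpy as np

def reconstruction_error(K, idx):
    """Tr(K - K[:, idx] K[idx, idx]^+ K[idx, :]): the feature-space residual
    left after reconstructing every point from the anchors in idx
    (assumed objective, not taken from the patent)."""
    if len(idx) == 0:
        return float(np.trace(K))
    K_zz = K[np.ix_(idx, idx)]
    K_xz = K[:, idx]
    return float(np.trace(K) - np.trace(K_xz @ np.linalg.pinv(K_zz) @ K_xz.T))

def greedy_anchors(K, m):
    """Pick m anchor indices p(1), ..., p(m) one at a time; each step keeps
    the earlier anchors fixed and adds the candidate with the lowest error."""
    selected = []
    for _ in range(m):
        candidates = [i for i in range(K.shape[0]) if i not in selected]
        best = min(candidates,
                   key=lambda i: reconstruction_error(K, selected + [i]))
        selected.append(best)
    return selected

# Two well-separated toy clusters of three points each (hypothetical data).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
sq = np.sum(X**2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T))  # RBF kernel, gamma = 1
anchors = greedy_anchors(K, m=2)
```

On this toy data the two selected anchors fall in different clusters, because a second anchor from the already-covered cluster would reduce the residual far less than one from the uncovered cluster.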


Abstract

The present invention relates to an active learning big data labeling method and system. The method comprises: linearly reconstructing each data point in the data set to be labeled from the anchor data set of that data set; calculating the distances between data points; using the distances as weights on the reconstruction parameters to construct a regularization term, wherein the distances are inversely proportional to the reconstruction parameters; constructing a data labeling model and performing the corresponding processing and correction on it; and solving the resulting optimization problem to determine the anchor data for active learning. Because the distances are inversely proportional to the reconstruction parameters, the data labeling model is sensitive to the distances between data points, and during the solution and optimization process it is easier to judge from the infinity-norm value whether a data point is representative. The anchor data set for active learning can therefore be screened accurately from the data set to be labeled, which improves the accuracy of big data anchor labeling.
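A minimal sketch of the pipeline described in the abstract is given below. The patent's exact model is not included in this excerpt, so several pieces are assumptions: the objective 0.5·Tr(K − 2KA + AᵀKA) + λ·Σ_i ||W_i ⊙ A_i||₂, the weights W = 1 + D (larger distance, heavier penalty, hence smaller reconstruction coefficient), the value of `lam`, and all function names. The solver is a plain proximal-gradient loop, and each point is scored by the infinity norm of its row of coefficients.

```python
import numpy as np

def distance_weights(K):
    """W = 1 + D, where D holds kernel-induced distances (assumed weighting)."""
    diag = np.diag(K)
    D = np.sqrt(np.maximum(diag[:, None] + diag[None, :] - 2.0 * K, 0.0))
    return 1.0 + D

def objective(K, A, W, lam):
    """Reconstruction loss plus distance-weighted row-sparsity penalty."""
    loss = 0.5 * np.trace(K - 2.0 * K @ A + A.T @ K @ A)
    return float(loss + lam * np.sum(np.linalg.norm(W * A, axis=1)))

def weighted_reconstruction(K, lam=0.3, n_iter=300):
    """Proximal gradient on B = W * A; returns A and row infinity-norm scores.

    A[i, j] is the weight of point i when reconstructing point j.
    """
    W = distance_weights(K)
    step = 1.0 / np.linalg.eigvalsh(K)[-1]    # 1 / largest eigenvalue of K
    B = np.zeros_like(K)
    for _ in range(n_iter):
        A = B / W
        grad_B = (K @ A - K) / W              # chain rule through A = B / W
        V = B - step * grad_B                 # gradient step on the smooth part
        norms = np.linalg.norm(V, axis=1, keepdims=True)
        shrink = np.maximum(1.0 - lam * step / np.maximum(norms, 1e-12), 0.0)
        B = shrink * V                        # row-wise soft-thresholding (prox)
    A = B / W
    scores = np.max(np.abs(A), axis=1)        # infinity norm of each row
    return A, scores

# Two toy clusters (hypothetical data) with an RBF kernel, gamma = 1.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
sq = np.sum(X**2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T))
A, scores = weighted_reconstruction(K)
ranked = np.argsort(-scores)                  # most representative first
```

Points whose rows carry large coefficients are the ones other points lean on for reconstruction, so a high score is read here as "representative". Because W is at least 1 everywhere, the step size 1/λ_max(K) keeps the iteration a descent method for this assumed objective.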

Description

Technical field

[0001] The invention relates to the technical field of big data machine learning, in particular to an active learning big data labeling method and system.

Background

[0002] With the advent of the big data era, and especially the development of Internet technology, machine learning applications face an ever larger amount of data. Traditional supervised learning methods achieve better results than semi-supervised learning methods, but they often require a large amount of labeled data to do so. Although the big data era makes large amounts of data easy to obtain for machine learning tasks, obtaining accurately labeled data still requires a great deal of manpower and material resources. Active learning technology in the field of big data machine learning can select the most valuable data from massive unlabeled samples for labeling, which can greatly...


Application Information


IPC(8): G06N5/02, G06N99/00

CPC: G06N5/025, G06N20/00

Inventor: 李明强

Owner: 广州图普网络科技有限公司
