High-efficiency SVM active semi-supervised learning algorithm

A semi-supervised learning and active learning technology, applied in computing, computer components, instruments, etc., that addresses problems such as the lack of incremental learning ability, degraded active-learning performance, and high computational complexity.

Inactive Publication Date: 2015-01-28
AIR FORCE UNIV PLA

Problems solved by technology

[0027] 1. High computational complexity of sample selection
For example, the active learning method based on error reduction needs to search the entire sample space before selecting samples. For a large set of unlabeled samples, this selection strategy directly computes the classifier's classification error on the test data set after each candidate sample is added; the computational complexity is so high that the approach is infeasible in practice;
[0028] 2. Sensitivity to label noise and unbalanced data distributions, which makes it easy to select repeated, similar, or uninformative samples
For example, active learning based on uncertainty sampling may select isolated points, and it is difficult to distinguish samples that carry a large amount of information from outliers;
[0029] 3. The influence of error propagation
That is, if the learner trained in the initial stage of active learning is inaccurate, the samples selected during the active learning process may not be the samples that are "most beneficial" to training the learner, which degrades active-learning performance;
[0030] 4. Active learning lacks incremental learning ability
Active learning is an iterative process: every time new samples are added, the classifier must be retrained. However, general active learning methods cannot learn incrementally, so the repeated iterations bring high time complexity and space complexity;
[0031] 5. Active learning requires substantial manual feedback on the sampled samples
Although the relevant literature also discusses view-division methods, such as local feature-set segmentation under an independence assumption, segmentation under the condition of a sufficiently large attribute set, and view division based on 1-DNF, these conditions are difficult to satisfy in most practical applications, and it may even be impossible to give a natural partition of the feature set.




Embodiment Construction

[0107] SVM active learning generally selects the samples about which the current learner is most uncertain (lowest confidence) for labeling, while relatively certain or well-represented samples are not used for training. Semi-supervised learning methods, in contrast, can exploit the samples that the classifier labels with relatively high confidence, making full use of the useful information contained in the unlabeled samples for classifier training. Combining the two avoids the error propagation caused by the uncertainty of the initial classifier in SVM active learning and thereby improves SVM active-learning performance. Based on this, the present invention provides an SVM active semi-supervised learning algorithm that fuses semi-supervised learning and active learning. Referring to figure 1 below, the process of the efficient SVM active semi-supervised learning algorithm of the present invention is introduced in detail.
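As a rough, non-authoritative illustration of this splitting idea (assuming scikit-learn's SVC and a binary problem; the threshold value and names such as `split_by_confidence` are invented for the example and do not come from the patent), a minimal Python sketch:

```python
# Hedged sketch (not the patent's exact procedure): split the unlabeled pool
# by the distance of each sample to the current SVM hyperplane. Assumes
# scikit-learn and a binary problem; the threshold value is illustrative.
import numpy as np
from sklearn.svm import SVC

def split_by_confidence(clf, X_unlabeled, threshold=1.0):
    """Return indices of high-confidence and low-confidence unlabeled samples.

    |f(x)| >= threshold -> candidates for semi-supervised pseudo-labeling
    |f(x)| <  threshold -> candidates for active-learning queries
    """
    margins = np.abs(clf.decision_function(X_unlabeled))
    high_conf = np.where(margins >= threshold)[0]
    low_conf = np.where(margins < threshold)[0]
    return high_conf, low_conf

# Usage (X_labeled, y_labeled, X_unlabeled are placeholders):
# clf = SVC(kernel="rbf").fit(X_labeled, y_labeled)
# semi_idx, active_idx = split_by_confidence(clf, X_unlabeled)
```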

[0108] In this embodiment, the data sets used are breast-cancer-wisconsin, ionosphere, house-votes-84, he...
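For readers who want to reproduce a comparable setup, one possible way to obtain one of the named benchmark data sets is sketched below; the labeled/unlabeled split ratio is an assumed placeholder, not a value reported in the patent.

```python
# Hedged sketch: obtaining one of the benchmark data sets named above.
# scikit-learn ships a Breast Cancer Wisconsin data set (the diagnostic
# version); the labeled/unlabeled split ratio below is an assumption.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Keep a small labeled seed set; treat the remainder as the unlabeled pool Us.
X_labeled, X_pool, y_labeled, y_pool = train_test_split(
    X, y, train_size=0.05, stratify=y, random_state=0
)
```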



Abstract

The invention discloses a high-efficiency SVM active semi-supervised learning algorithm. The algorithm comprises: (1) training an initial SVM classifier f_SVM^(0); (2) determining whether f_SVM^(0) satisfies the learning termination condition and, if not, proceeding to step (3); (3) predicting labels for the unlabeled sample set Us with f_SVM^(0); (4) performing Tri-training-based semi-supervised learning on the samples in Us whose predicted-label confidence is greater than a threshold and QBC-based active learning on those whose confidence is smaller than the threshold, and adding the samples selected by the semi-supervised learning / active learning to the labeled training sample set; (5) training f_SVM^(k) on the updated labeled training sample set; and (6) repeating from step (2) until the SVM classifier satisfies the termination condition of the active learning. The algorithm provided by the invention has the following advantages: during SVM training, the samples that most benefit classifier performance are selected autonomously according to the learning process and used to train the classifier; after these samples are added to the training set, the accuracy of labeling the unlabeled samples through semi-supervised learning is improved to the greatest degree, and the SVM classification precision is enhanced.
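A minimal Python sketch of the loop described in steps (1)-(6), assuming scikit-learn: a single self-training pass stands in for the Tri-training component, simple margin-based sampling with a simulated oracle stands in for the QBC query, and the threshold, query budget, and stopping rule are illustrative assumptions rather than the patent's actual choices.

```python
# Hedged sketch of the loop in steps (1)-(6). Self-training replaces the
# Tri-training step and margin-based sampling with a simulated oracle
# replaces the QBC query; all parameter values are assumptions.
import numpy as np
from sklearn.svm import SVC

def active_semi_supervised_svm(X_l, y_l, X_u, y_oracle,
                               conf_threshold=1.5, query_budget=5,
                               max_rounds=10):
    clf = SVC(kernel="rbf").fit(X_l, y_l)              # (1) initial f_SVM^(0)
    for _ in range(max_rounds):                        # (2)/(6) stopping rule
        if len(X_u) == 0:
            break
        margins = clf.decision_function(X_u)           # (3) predict on Us
        pseudo = np.abs(margins) >= conf_threshold     # (4a) high confidence
        low = np.where(~pseudo)[0]                     # (4b) low confidence
        query = low[np.argsort(np.abs(margins[low]))[:query_budget]]

        X_new, y_new = [], []
        if pseudo.any():                               # semi-supervised part
            X_new.append(X_u[pseudo])
            y_new.append(clf.predict(X_u[pseudo]))     # pseudo-labels
        if len(query) > 0:                             # active-learning part
            X_new.append(X_u[query])
            y_new.append(y_oracle[query])              # simulated manual labels
        if not X_new:
            break
        X_l = np.vstack([X_l] + X_new)                 # grow labeled set
        y_l = np.concatenate([y_l] + y_new)

        keep = np.ones(len(X_u), dtype=bool)           # shrink unlabeled pool
        keep[pseudo] = False
        keep[query] = False
        X_u, y_oracle = X_u[keep], y_oracle[keep]

        clf = SVC(kernel="rbf").fit(X_l, y_l)          # (5) retrain f_SVM^(k)
    return clf
```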

Description

Technical field
[0001] The invention relates to an algorithm, in particular to an efficient SVM active semi-supervised learning algorithm, and belongs to the technical field of machine learning algorithms.
Background technique
[0002] SVM (Support Vector Machine) is a pattern recognition method developed on the basis of the VC-dimension theory of statistical learning theory and the principle of structural risk minimization. Based on limited sample information, it seeks the best compromise between model complexity (i.e., the learning accuracy on the given training samples) and learning ability (i.e., the ability to classify unseen samples without error) in order to obtain the best generalization ability. It largely solves problems of traditional pattern recognition techniques such as model selection and over-fitting, nonlinearity and the curse of dimensionality, and local minima. Many unique advantages have ...
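To make the complexity/learning-ability compromise concrete, the following sketch (assuming scikit-learn; the data and parameter values are arbitrary examples, not taken from the patent) varies the soft-margin parameter C of an RBF-kernel SVC and reports cross-validated accuracy.

```python
# Hedged illustration of the complexity/accuracy compromise: the soft-margin
# parameter C trades margin width (model simplicity) against training error.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=200, n_features=10, random_state=0)

for C in (0.01, 1.0, 100.0):
    # Small C tolerates more margin violations (simpler decision function);
    # large C fits the training data more tightly.
    score = cross_val_score(SVC(kernel="rbf", C=C), X, y, cv=5).mean()
    print(f"C={C:>6}: mean 5-fold CV accuracy = {score:.3f}")
```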

Claims


Application Information

IPC(8): G06K9/62
CPC: G06F18/2411
Inventor 徐海龙别晓峰龙光正冯卉吴天爱白东颖郭蓬松史向峰田野高歆
Owner AIR FORCE UNIV PLA