Streaming data integration classification method and device based on concept drift
A classification technology based on concept drift, applied to stream data integration classification methods and devices. It addresses problems that arise with frequently arriving stream data and achieves the effects of maintaining classification accuracy and coping with concept drift.
Examples
Embodiment 1
[0039] This embodiment discloses a stream data integration classification method based on concept drift, as shown in Figure 1, including the following steps:
[0040] The embodiment uses four datasets: the SEA, Covertype, HyperPlane, and Electricity datasets. All experiments run on the Massive Online Analysis (MOA) platform. A data stream generator simulates the data stream, which is processed in blocks. A threshold is set for the data block. While the block is within the threshold range, arriving data samples are filled into the current data block and split into samples with class labels and samples without class labels; the labeled samples are then distributed by category for training the base classifiers.
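The block-wise processing described in [0040] can be sketched as follows. This is a minimal illustration and not the patented implementation: it assumes the threshold is a sample count (`block_size`), that an unlabeled sample carries `None` as its label, and the helper names `blocks` and `split_block` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Optional, Tuple

Features = Tuple[float, ...]
Sample = Tuple[Features, Optional[str]]  # (features, class label or None)


def blocks(stream: Iterable[Sample], block_size: int) -> Iterator[List[Sample]]:
    """Fill the current block until it holds block_size samples, then emit it."""
    current: List[Sample] = []
    for sample in stream:
        current.append(sample)
        if len(current) == block_size:  # assumed threshold: a sample count
            yield current
            current = []
    if current:  # emit the last, partially filled block
        yield current


def split_block(block: List[Sample]) -> Tuple[Dict[str, List[Features]], List[Features]]:
    """Separate labeled samples (grouped by class) from unlabeled ones."""
    by_class: Dict[str, List[Features]] = defaultdict(list)
    unlabeled: List[Features] = []
    for features, label in block:
        if label is None:
            unlabeled.append(features)
        else:
            by_class[label].append(features)
    return dict(by_class), unlabeled


# Toy stream standing in for a generated data stream.
stream = [((0.1, 0.2), "a"), ((0.9, 0.8), "b"), ((0.2, 0.1), None),
          ((0.8, 0.9), "b"), ((0.15, 0.25), "a")]
for block in blocks(stream, block_size=3):
    by_class, unlabeled = split_block(block)
    print(sorted(by_class), len(unlabeled))
```

Each per-class list produced by `split_block` is the training input for one base classifier of that class.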
[0041] The data stream can be formalized as $x_1, x_2, \ldots, x_{t-1}, x_t, x_{t+1}, \ldots$ with $x_t = (S_1, S_2, \ldots, S_d, Y)$, where $t$ is the timestamp, $d$ is the number of sample attributes, $S$ is the feature vector of the sample, and $Y$ is a class label, a...
Embodiment 2
[0070] The purpose of this embodiment is to provide a computing device.
[0071] A stream data integration classification device based on concept drift, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the following steps:
[0072] obtaining multiple data blocks including labeled and unlabeled sample data;
[0073] training a single-class base classifier for each class in a plurality of said data blocks according to the class label;
[0074] constructing an integrated classification matrix according to the single-class base classifiers corresponding to a plurality of the data blocks;
[0075] when a new data block arrives, updating the integrated classification matrix and calculating the class label for the unlabeled samples.
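The four steps above can be illustrated with a small sketch. The source does not specify the base classifier, the matrix update rule, or how class labels are calculated, so everything below is an assumption made for illustration: `CentroidScorer` is a hypothetical single-class classifier that scores a sample by its closeness to the class centroid, the matrix keeps one row per data block and one entry per class, the update simply drops the oldest row once `max_blocks` rows are held, and the predicted label is the class with the highest summed score.

```python
import math
from collections import deque
from typing import Deque, Dict, List, Sequence, Tuple

Features = Tuple[float, ...]


class CentroidScorer:
    """Hypothetical single-class base classifier (assumption, not from the source)."""

    def __init__(self, samples: Sequence[Features]) -> None:
        n = len(samples)
        # Column-wise mean of the training samples of one class.
        self.centroid = tuple(sum(col) / n for col in zip(*samples))

    def score(self, x: Features) -> float:
        # Higher score means x lies closer to this class's centroid.
        return 1.0 / (1.0 + math.dist(x, self.centroid))


class ClassificationMatrix:
    """Rows are data blocks, columns are classes, cells are base classifiers."""

    def __init__(self, max_blocks: int = 3) -> None:
        # deque(maxlen=...) discards the oldest row when a new one is appended.
        self.rows: Deque[Dict[str, CentroidScorer]] = deque(maxlen=max_blocks)

    def update(self, labeled_by_class: Dict[str, List[Features]]) -> None:
        """Train one base classifier per class in the new block and add a row."""
        self.rows.append({label: CentroidScorer(xs)
                          for label, xs in labeled_by_class.items() if xs})

    def predict(self, x: Features) -> str:
        """Sum the per-class scores over all rows and return the best class."""
        totals: Dict[str, float] = {}
        for row in self.rows:
            for label, clf in row.items():
                totals[label] = totals.get(label, 0.0) + clf.score(x)
        if not totals:
            raise ValueError("no base classifiers have been trained yet")
        return max(totals, key=lambda label: totals[label])


matrix = ClassificationMatrix(max_blocks=2)
matrix.update({"a": [(0.0, 0.0), (0.2, 0.0)], "b": [(1.0, 1.0)]})
matrix.update({"a": [(0.1, 0.1)], "b": [(0.9, 1.1), (1.1, 0.9)]})
print(matrix.predict((0.05, 0.1)))
```

Dropping the oldest row is only one possible way to react to concept drift; a weighted or accuracy-based replacement of rows would fit the same structure.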
Embodiment 3
[0077] The purpose of this embodiment is to provide a computer-readable storage medium.
[0078] A computer-readable storage medium having stored thereon a computer program that, when executed by a processor, performs the following steps:
[0079] obtaining multiple data blocks including labeled and unlabeled sample data;
[0080] training a single-class base classifier for each class in a plurality of said data blocks according to the class label;
[0081] constructing an integrated classification matrix according to the single-class base classifiers corresponding to a plurality of the data blocks;
[0082] when a new data block arrives, updating the integrated classification matrix and calculating the class label for the unlabeled samples.