Time series classification method and device based on characteristic sampling

This time series classification technology, applied in the field of data processing, addresses three problems: existing algorithms do not meet the generalization and accuracy needs of time series classification, they cannot handle local similarity, and they rely on single models whose performance is weak. It achieves good results, high classification accuracy, and strong generalization.

Active Publication Date: 2018-09-25
HARBIN INST OF TECH
7 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0004] Although there are many algorithms for time series classification, they do not meet the needs of time series classification problems in terms of generalization and accuracy, mainly in the following respects:
[0005] 1. Many of the time series data conversion and classification algorithms proposed so far are effective mainly on small-scale time series data. They are not applicable to large-scale time series data because of limits on memory, processing time, and other resources.
[0006] 2. Time series data has complex attributes such as local similarities and dependencies, and current time series processing algorithms cannot deal with local similarity problems.
[0007] 3. Most current time series classification algorithms are based on a single linear model or tree model. The performance of a single model is weak, so the accuracy is low.



Examples


Embodiment 1

[0052] As shown in Figure 1, the time series classification method based on feature sampling provided by the embodiment of the present invention may include the following steps:

[0053] Step S101: converting the time series data set used for training into a training data set with equal-length features by using a feature sampling method, and converting the time series data set used for testing into a test data set with equal-length features;

[0054] Step S102: performing model training with an ensemble learning classification method on the training data set with equal-length features;

[0055] Step S103: Use the trained model to perform time series classification on the test data set with features of equal length.
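
The three steps above can be read as a small pipeline. The following is a minimal sketch under stated assumptions, not the patented implementation: `run_pipeline`, `first_k`, and `fit_nearest_centroid` are hypothetical names, and the sampling and training functions are deliberately simple stand-ins (the first k values and a nearest-centroid model) so that the example stays self-contained. In the method described here, the sampling step would be a feature sampling method and the training step an ensemble learner.

```python
from collections import Counter
from typing import Callable, List, Sequence

Series = Sequence[float]
Features = List[float]
Predictor = Callable[[Features], str]


def run_pipeline(
    train_series: List[Series],
    train_labels: List[str],
    test_series: List[Series],
    sample: Callable[[Series], Features],
    fit: Callable[[List[Features], List[str]], Predictor],
) -> List[str]:
    # S101: map every series, whatever its length, to an equal-length feature vector.
    train_x = [sample(s) for s in train_series]
    test_x = [sample(s) for s in test_series]
    if len({len(x) for x in train_x + test_x}) != 1:
        raise ValueError("sampling must yield equal-length features")
    # S102: train a model on the equal-length training features.
    predict = fit(train_x, train_labels)
    # S103: classify the equal-length test features with the trained model.
    return [predict(x) for x in test_x]


def first_k(series: Series, k: int = 3) -> Features:
    """Hypothetical stand-in sampler: keep the first k values."""
    return list(series[:k])


def fit_nearest_centroid(xs: List[Features], ys: List[str]) -> Predictor:
    """Hypothetical stand-in model: assign the label of the closest class mean."""
    counts = Counter(ys)
    sums: dict = {}
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

    def predict(x: Features) -> str:
        def dist(y: str) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, centroids[y]))
        return min(centroids, key=dist)

    return predict


if __name__ == "__main__":
    train = [[0, 0, 0, 0], [1, 0, 1, 0, 0], [9, 9, 9], [8, 9, 10, 9, 9, 9]]
    labels = ["low", "low", "high", "high"]
    test = [[0, 1, 0, 0, 0, 0], [10, 9, 8]]
    print(run_pipeline(train, labels, test, first_k, fit_nearest_centroid))
```

Passing the sampler and the trainer as arguments keeps the three steps visible and lets either component be swapped without touching the pipeline itself.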

[0056] The present invention mainly addresses the feature selection problem in time series classification. Given a sample set of time series, each sample contains a time series X_i = {x(t) | t ∈ {1, 2, …, T}} and the classification label corresponding to the time ser...

Embodiment 2

[0060] On the basis of the time series classification method based on feature sampling provided in Embodiment 1, the feature sampling method in step S101 is a segmented feature sampling method, which can be specifically implemented in the following ways:

[0061] First, set the segment length l_1, the number of segments m_1, and the interval between segments g_1;

[0062] Subsequently, for each piece of time series data in the time series dataset, as shown in Figure 2a, sampling starts from the start position of the time series data, and l_1 consecutive values are selected as the first piece of feature data. The starting position of each later segment is the starting position of the previous segment plus the interval between segments g_1, and l_1 consecutive values are selected for each segment. After sampling, each piece of time series data has been converted into m_1 pieces of feature data (that is, sub-time series T1, T2, T3...). Preferably, l_1 = 5, g_1 =...
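
The segmented sampling described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: `segment_sample` and its parameter names are hypothetical, with `seg_len`, `n_segs`, and `gap` standing for l_1, m_1, and g_1. The text visible here does not say how a series shorter than the sampled span is handled, so raising an error in that case is an assumption. The segment length of 5 follows the preferred value given above, while the gap of 2 and the count of 3 in the usage example are illustrative only.

```python
from typing import List, Sequence


def segment_sample(
    series: Sequence[float], seg_len: int, n_segs: int, gap: int
) -> List[List[float]]:
    """Cut n_segs segments of seg_len consecutive values from one series.

    Segment k starts at index k * gap, i.e. each start is the previous
    start plus the interval between segments.
    """
    if seg_len <= 0 or n_segs <= 0 or gap <= 0:
        raise ValueError("seg_len, n_segs and gap must be positive")
    needed = (n_segs - 1) * gap + seg_len  # index span covered by all segments
    if len(series) < needed:
        # Assumption: short series are rejected rather than padded.
        raise ValueError(f"series has {len(series)} values, needs {needed}")
    return [list(series[k * gap : k * gap + seg_len]) for k in range(n_segs)]


if __name__ == "__main__":
    segments = segment_sample(list(range(10)), seg_len=5, n_segs=3, gap=2)
    print(segments)
    # [[0, 1, 2, 3, 4], [2, 3, 4, 5, 6], [4, 5, 6, 7, 8]]
```

Because every series yields the same number of segments of the same length, series of different lengths end up with features of equal length, which is what the later training step requires.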

Embodiment 3

[0064] On the basis of the time series classification method based on feature sampling provided in Embodiment 2, the ensemble learning classification method adopted in step S102 is a random forest classifier based on bagging.

[0065] After the sampling method is determined, the selection and design of the classifier is also an important part of the present invention. The current mainstream classification methods include basic methods such as decision trees and logistic regression, as well as ensemble learning methods such as XGBoost and gradient boosting trees. Algorithms such as decision trees and logistic regression are easy to implement, but their accuracy is lower than that of ensemble learning classification methods. The present invention therefore adopts an ensemble learning classification method, and according to the experimental results the random forest classifier works best. The m...
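
To illustrate the bagging idea behind the classifier named above, here is a minimal sketch of bootstrap aggregation with majority voting. It is not a random forest and not the patent's classifier: a random forest grows full decision trees and also samples a subset of features at each split, whereas this sketch uses single-threshold decision stumps as a stand-in base learner to stay short. All names (`fit_stump`, `fit_bagging`, `n_models`, `seed`) are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, List, Sequence

Row = Sequence[float]
Model = Callable[[Row], str]


def majority(labels: Sequence[str]) -> str:
    """Return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(xs: List[Row], ys: List[str]) -> Model:
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in range(len(xs[0])):
        for t in sorted({x[f] for x in xs}):
            left = [y for x, y in zip(xs, ys) if x[f] <= t]
            right = [y for x, y in zip(xs, ys) if x[f] > t]
            l_lab = majority(left)  # left is never empty: t is an observed value
            r_lab = majority(right) if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab


def fit_bagging(
    xs: List[Row], ys: List[str], n_models: int = 25, seed: int = 0
) -> Model:
    """Train n_models stumps on bootstrap samples and combine them by vote."""
    rng = random.Random(seed)
    n = len(xs)
    models: List[Model] = []
    for _ in range(n_models):
        # Bootstrap: draw n training rows with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return lambda x: majority([m(x) for m in models])


if __name__ == "__main__":
    xs = [[0.0, 1.0], [0.2, 0.9], [0.1, 1.1], [5.0, 1.0], [5.2, 0.8], [4.9, 1.2]]
    ys = ["a", "a", "a", "b", "b", "b"]
    predict = fit_bagging(xs, ys)
    print(predict([-1.0, 1.0]), predict([6.0, 1.0]))
```

Each base learner sees a different resample of the training data, so their individual errors tend to differ and the vote is more stable than any single stump. This is the motivation the text gives for preferring an ensemble over a single model.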



Abstract

The invention relates to the technical field of data processing and provides a time series classification method and device based on feature sampling. The method comprises the following steps: converting the time series data set used for training into a training data set with equal-length features by using a feature sampling method, and converting the time series data set used for testing into a test data set with equal-length features; performing model training with an ensemble learning classification method on the training data set with equal-length features; and performing time series classification on the test data set with equal-length features by using the trained model. Time series data sets of different lengths are first converted into data sets with equal-length features by the feature sampling method, and classification is then performed with the ensemble learning classification method, so that the accuracy of time series classification is improved and large-scale time series data can be classified accurately.

Description

Technical field

[0001] The present invention relates to the technical field of data processing, in particular to a time series classification method and device based on feature sampling.

Background technique

[0002] Time series classification has a wide range of applications in many fields, such as the use of the Hidden Markov Model (HMM) and Dynamic Time Warping (DTW) in speech processing and speech recognition. In the database field, a database composed of values that change over time is called a time-series database, and data mining on such a database is called time-series mining. The time series classification problem is very important for time-series mining. Compared with ordinary classification data, time series have variable length, strong dependence between earlier and later values in the sequence, and more noisy data.

[0003] Due to the special properties of the data sequence, the time series classificatio...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventor: 王宏志, 孟凡山, 齐志鑫, 高宏
Owner: HARBIN INST OF TECH