Method for identifying abnormal data points in time series data based on global information

A time series and abnormal data technology, applied in the field of data cleaning, that addresses the difficulty of effectively and accurately locating abnormal data points, a difficulty that in turn impairs their identification.

Active Publication Date: 2018-11-02
EAST CHINA NORMAL UNIV

AI Technical Summary

Problems solved by technology

However, such methods only smooth all the data points in the sequence, which makes it difficult to effectively determine where the abnormal points are.

[0004] In summary, when identifying sudden abnormal data points in time series data, it is difficult to determine their locations effectively and accurately, and this impairs the identification of abnormal data points.



Examples


Embodiment

[0016] As shown in Figure 1, the method of the present invention for identifying abnormal data points in time series data based on global information comprises the following specific steps:

[0017] S101: Obtain original time series data:

[0018] Get the original time series data $x = \{\langle t_1, x_1 \rangle, \langle t_2, x_2 \rangle, \ldots, \langle t_n, x_n \rangle\}$, where $t_i$ indicates the observation time of the i-th data point and $x_i$ indicates the observed value of the i-th data point, $i = 1, 2, 3, \ldots, n$. Table 1 is the original time series data table in this embodiment.

[0021] Table 1

[0022] S102: Calculate the observed-value change speed of each data point and the speed change rate of each data point:

[0023] From the observed values of the data points in the original time series data, calculate the observed-value change speed of each data point; then, from these change speeds, calculate the speed change rate of each data point: ...
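The exact formulas are cut off in this extract, so the following is only a minimal sketch of step S102. It assumes that the change speed is a finite difference of observed values over time and that the speed change rate is a finite difference of consecutive speeds. The names `change_speeds` and `speed_change_rates` and the sample values are hypothetical and do not come from the patent.

```python
from typing import List, Tuple

# A time series as (t_i, x_i) pairs sorted by observation time.
Series = List[Tuple[float, float]]


def change_speeds(series: Series) -> List[float]:
    """Assumed definition: speed between consecutive points,
    (x_i - x_{i-1}) / (t_i - t_{i-1})."""
    return [
        (x1 - x0) / (t1 - t0)
        for (t0, x0), (t1, x1) in zip(series, series[1:])
    ]


def speed_change_rates(series: Series) -> List[float]:
    """Assumed definition: change of speed around each interior point,
    divided by the length of the following time interval."""
    speeds = change_speeds(series)
    times = [t for t, _ in series]
    # speeds[k] is the speed over [times[k], times[k + 1]], so the pair
    # (speeds[k], speeds[k + 1]) meets at the point with index k + 1.
    return [
        (speeds[k + 1] - speeds[k]) / (times[k + 2] - times[k + 1])
        for k in range(len(speeds) - 1)
    ]


# Hypothetical sample: the fourth observation is a sudden spike.
sample: Series = [(1.0, 10.0), (2.0, 12.0), (3.0, 13.0), (4.0, 30.0), (5.0, 16.0)]
speeds = change_speeds(sample)
rates = speed_change_rates(sample)
```

A series of n points yields n - 1 speeds and n - 2 speed change rates under these assumptions, so the first and last points have no speed change rate of their own.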



Abstract

The invention discloses a method for identifying abnormal data points in time series data based on global information. The method comprises the following steps: obtaining the original time series data and the observed value of each data point; calculating the observed-value change speed and the speed change rate of each data point; calculating, from the speed change rates of the data points, the average speed change rate of the original time series data, computing the discrete probability distribution of the speed change rates, and fitting that distribution to obtain a probability density function; and detecting abnormal data points according to the value change speed constraint of the time series data and the speed change rate of each data point. The average speed change rate and the probability distribution of the speed change rate serve as global information that fully reflects the overall characteristics of the time series data, so data points that violate these overall characteristics can be found and effectively identified as abnormal. The method can accurately identify sudden abnormal data points in time series data.
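The steps in the abstract can be sketched end to end as follows. This is a rough illustration, not the patented procedure: the finite-difference definitions, the normal density used as the fitted function, the bin width, the thresholds `max_speed` and `min_density`, the decision rule, and all function names are assumptions introduced here.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def speeds_and_rates(times, values):
    """Assumed finite-difference speed and speed change rate."""
    speeds = [(values[i] - values[i - 1]) / (times[i] - times[i - 1])
              for i in range(1, len(values))]
    rates = [(speeds[k + 1] - speeds[k]) / (times[k + 2] - times[k + 1])
             for k in range(len(speeds) - 1)]
    return speeds, rates


def discrete_distribution(rates, bin_width):
    """Empirical probability of each bin of speed change rates,
    keyed by the lower edge of the bin."""
    counts = Counter(math.floor(r / bin_width) for r in rates)
    return {b * bin_width: c / len(rates) for b, c in sorted(counts.items())}


def fit_normal_pdf(rates):
    """Assumption: a normal density stands in for the fitted function.
    Requires that the rates are not all identical (sigma > 0)."""
    mu, sigma = mean(rates), pstdev(rates)

    def pdf(r):
        z = (r - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    return pdf


def detect(times, values, max_speed, min_density):
    """Flag a point when the speed arriving at it breaks the assumed speed
    constraint and its speed change rate is unlikely under the fitted pdf."""
    speeds, rates = speeds_and_rates(times, values)
    pdf = fit_normal_pdf(rates)
    flagged = []
    for k, rate in enumerate(rates):
        point = k + 1  # 0-based index of the point that rates[k] describes
        if abs(speeds[k]) > max_speed and pdf(rate) < min_density:
            flagged.append(point)
    return flagged


# Hypothetical data: a steady ramp with one sudden spike (the value 30.0).
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
values = [10.0, 11.0, 12.0, 13.0, 30.0, 15.0, 16.0, 17.0, 18.0]

_, rates = speeds_and_rates(times, values)
average_rate = mean(rates)                       # global information, part 1
distribution = discrete_distribution(rates, 8.0)  # global information, part 2
flagged = detect(times, values, max_speed=5.0, min_density=0.01)
```

Combining the speed constraint with the fitted density is one possible reading of "according to the value change speed constraint and the speed change rate". It keeps the point that follows the spike from being flagged merely because the series returns to normal at high speed.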

Description

Technical field

[0001] The invention belongs to the field of data cleaning and, more specifically, relates to a method for identifying abnormal data points in time series data based on global information.

Background

[0002] With the development of information technology, data is generated and used continuously. Data volumes in every industry keep growing, and data plays a very important role in people's lives. Owing to the widespread use of sensors, more and more time series data, such as air temperature data and GPS trajectory data, is collected and applied in daily life. Because these data contain rich information, time series data mining has become an active research topic. At the same time, dirty data is also widespread in time series data, and low-quality time series data seriously affects data mining and analysis. There is no doubt that by cleaning the time series data, thereby improving the data quality of the time s...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
Inventors: 王晓玲, 刘小捷, 宋光旋
Owner: EAST CHINA NORMAL UNIV