Method for identifying abnormal data points in time series data based on global information

A technique for detecting abnormal data in time series, applied in the field of data cleaning. It addresses the difficulty of locating abnormal data points effectively and accurately, which impairs their identification.

Active Publication Date: 2021-09-14
EAST CHINA NORMAL UNIV

AI Technical Summary

Problems solved by technology

However, such methods merely smooth all the data points in the sequence, which makes it difficult to judge where the abnormal points are.
[0004] In summary, when identifying sudden abnormal data points in time series data, it is difficult to locate them effectively and accurately, which impairs their identification.



Examples


Embodiment

[0016] As shown in Figure 1, the method of the present invention for identifying abnormal data points in time series data based on global information comprises the following steps:

[0017] S101: Obtain original time series data:

[0018] Obtain the original time series data $x = \{\langle t_1, x_1\rangle, \langle t_2, x_2\rangle, \ldots, \langle t_n, x_n\rangle\}$, where $t_i$ denotes the observation time of the i-th data point, $x_i$ denotes the observed value of the i-th data point, and $i = 1, 2, 3, \ldots, n$. Table 1 lists the original time series data used in this embodiment.

[0021] Table 1: original time series data (the table body is missing from this copy)

[0022] S102: Calculate the change speed of the observed value of each data point and the speed change rate of each data point:

[0023] From the observed value of each data point in the original time series data, calculate the change speed of the observed value at each data point. Then, from these change speeds, calculate the speed change rate of each data point: ...
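The formulas for step S102 are cut off in this text, so the following Python sketch is only an illustration under stated assumptions: it takes the change speed to be the finite difference of consecutive observed values over their time gap, and the speed change rate to be the finite difference of consecutive speeds over the same gap. The function names `change_speeds` and `speed_change_rates` are hypothetical and do not come from the patent.

```python
def change_speeds(times, values):
    """Change speed between consecutive observations.

    ASSUMPTION: speed_i = (x_{i+1} - x_i) / (t_{i+1} - t_i).
    The patent's exact formula is not shown in this text.
    """
    return [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(values) - 1)
    ]


def speed_change_rates(times, speeds):
    """Rate of change between consecutive speeds.

    ASSUMPTION: rate_i = (speed_{i+1} - speed_i) / (t_{i+1} - t_i).
    """
    return [
        (speeds[i + 1] - speeds[i]) / (times[i + 1] - times[i])
        for i in range(len(speeds) - 1)
    ]


# Made-up sample series (not the patent's Table 1): the value at t=3 jumps.
times = [0, 1, 2, 3, 4]
values = [1.0, 2.0, 3.0, 10.0, 5.0]

speeds = change_speeds(times, values)
rates = speed_change_rates(times, speeds)
```

With this sample, the sudden jump at t=3 shows up as large positive and negative speed change rates around that point, which is the kind of signal the later steps examine.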


Abstract

The invention discloses a method for identifying abnormal data points in time series data based on global information, comprising the steps of: acquiring the original time series data and the observed value of each data point; calculating the change speed of the observed value and the speed change rate of each data point; calculating, from the speed change rates of the data points, the average speed change rate of the original time series data, compiling the discrete probability distribution of the speed change rates, and fitting that distribution to obtain a probability density function; and detecting abnormal data points according to the value change speed constraint of the time series data and the speed change rate of each data point. The average speed change rate and the probability distribution of the speed change rate serve as global information that reflects the overall characteristics of the time series data. Data points that violate these overall characteristics are identified as abnormal. The method can accurately identify sudden abnormal data points in time series data.
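The global-information steps in the abstract can be sketched in Python as follows. This is a rough illustration, not the patented procedure: the abstract does not say which density family is fitted or how the decision threshold is chosen, so the sketch assumes a normal density and a cutoff of `k` standard deviations, and it omits the value change speed constraint, whose form is not given here. All function names and parameters are hypothetical.

```python
import math
import statistics
from collections import Counter


def discrete_distribution(rates, bin_width):
    """Empirical probability of each bin of speed change rates.

    Keys are the lower edges of the bins; bin_width is an assumed parameter.
    """
    counts = Counter(math.floor(r / bin_width) for r in rates)
    total = len(rates)
    return {b * bin_width: c / total for b, c in sorted(counts.items())}


def fit_normal(rates):
    """ASSUMPTION: approximate the distribution with a normal density."""
    mu = statistics.fmean(rates)      # average speed change rate
    sigma = statistics.pstdev(rates)  # spread of the speed change rates
    return mu, sigma


def normal_pdf(r, mu, sigma):
    """Density of the fitted normal distribution at r."""
    z = (r - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def flag_anomalies(rates, k=2.0):
    """Indices whose rate is less likely than a point k sigmas from the mean.

    ASSUMPTION: k is an illustrative cutoff, not a value from the patent.
    """
    mu, sigma = fit_normal(rates)
    if sigma == 0:
        return []  # all rates identical, nothing stands out
    threshold = normal_pdf(mu + k * sigma, mu, sigma)
    return [i for i, r in enumerate(rates) if normal_pdf(r, mu, sigma) < threshold]


# Made-up speed change rates; index 8 is a sudden spike.
rates = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1, 0.0, -0.1, 6.0, -0.2]
dist = discrete_distribution(rates, bin_width=0.5)
flagged = flag_anomalies(rates)
```

Here the mean and the fitted density play the role of the global information: a rate is flagged only because it is improbable relative to the whole series, not relative to its immediate neighbours.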

Description

Technical field

[0001] The invention belongs to the field of data cleaning and, more specifically, relates to a method for identifying abnormal data points in time series data based on global information.

Background

[0002] With the development of information technology, data is generated and used constantly. Data across all industries keeps growing and plays an important role in daily life. With the widespread use of sensors, more and more time series data, such as air temperature data and GPS trajectory data, is collected and applied. Because this data is rich in information, time series data mining has become an active research topic. At the same time, dirty data is widespread in time series data, and low-quality time series data severely affects data mining and analysis. There is no doubt that by cleaning the time series data, thereby improving the data quality of the time s...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/215, G06F16/2458
Inventors: 王晓玲, 刘小捷, 宋光旋
Owner: EAST CHINA NORMAL UNIV