Repair method of abnormal data points in time series data based on global information

A technology for abnormal data and time series, applied in the field of data cleaning. It addresses problems such as the inability to find the most probable repair among the values that satisfy the constraints, over-repair, and incorrect modification of data points, and it achieves accurate repair while reducing the number of comparisons and the time overhead.

Active Publication Date: 2021-03-30
EAST CHINA NORMAL UNIV

AI Technical Summary

Problems solved by technology

However, such methods may cause over-repair, that is, originally correct data points are modified incorrectly.
Moreover, some constraint-based methods have been proposed recently, but existing constraint-based methods cannot find the most probable result among all valid repair values that satisfy the constraints.
The true value of an anomalous data point is difficult to estimate accurately, which makes time series data cleaning an extremely challenging problem.
[0004] In summary, current approaches to repairing sudden abnormal data points in time series data suffer from over-repair or from difficulty in repairing abnormal data points accurately.

Embodiment

[0018] As shown in Figure 1, the present invention aims to solve the problems of over-repair and inaccurate repair that arise when repairing sudden abnormal data points in time series data. It proposes a method for repairing abnormal data points in time series data based on global information, which includes the following steps:

[0019] S101: Obtain the original time series data and the location of abnormal data points;

[0020] Get the original time series data $x = \{\langle t_1, x_1 \rangle, \langle t_2, x_2 \rangle, \ldots, \langle t_I, x_I \rangle\}$ and the locations of the abnormal data points, where $t_i$ denotes the observation time of the i-th data point and $x_i$ denotes the observed value of the i-th data point.
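The input described in S101 could be represented as follows. This is only a minimal sketch: the function name `unpack_series`, the zero-based anomaly positions, and the sample values are assumptions made for illustration and do not come from the patent text.

```python
from typing import List, Sequence, Tuple

# One data point is a pair (t_i, x_i): observation time and observed value.
DataPoint = Tuple[float, float]


def unpack_series(
    series: Sequence[DataPoint], anomaly_idx: Sequence[int]
) -> Tuple[List[float], List[float], List[bool]]:
    """Split the series into times, values, and a per-point anomaly flag.

    anomaly_idx holds zero-based positions of abnormal points (an assumption;
    the source indexes points from 1 to I).
    """
    anomalous = set(anomaly_idx)
    times = [t for t, _ in series]
    values = [x for _, x in series]
    flags = [i in anomalous for i in range(len(series))]
    return times, values, flags


# Hypothetical sample data, not taken from Table 1.
series = [(1.0, 20.1), (2.0, 20.3), (3.0, 95.0), (4.0, 20.4)]
times, values, flags = unpack_series(series, [2])
```

Keeping the anomaly positions separate from the values leaves the original observations untouched until a repair value is chosen.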

[0021] Table 1 is the original time series data and abnormal data point locations in this embodiment.

[0022]

[0023]

[0024] Table 1

[0025] S102: Initialize abnormal data points:

[0026] For each abnormal data point, k nearest neighbor algo...
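The source text is cut off at this point, so the exact initialization rule is not available here. The abstract describes it as a preliminary repair using the mean of k neighbouring data points. A minimal sketch of that idea, assuming that neighbours are chosen by position in the series and that other abnormal points are excluded (both are assumptions, as is the function name `init_anomalies`):

```python
from typing import List, Sequence


def init_anomalies(
    values: Sequence[float], anomaly_idx: Sequence[int], k: int = 2
) -> List[float]:
    """Replace each abnormal value with the mean of its k nearest normal values.

    Nearness is measured by index distance (an assumption); ties go to the
    earlier position so the result is deterministic.
    """
    anomalous = set(anomaly_idx)
    normal = [i for i in range(len(values)) if i not in anomalous]
    if not normal:
        raise ValueError("no normal data points to draw neighbours from")
    repaired = list(values)
    for a in anomalous:
        nearest = sorted(normal, key=lambda i: (abs(i - a), i))[:k]
        repaired[a] = sum(values[i] for i in nearest) / len(nearest)
    return repaired


# Hypothetical values: the point at position 2 is flagged as abnormal.
repaired = init_anomalies([10.0, 12.0, 100.0, 14.0, 16.0], [2], k=2)
```

In this example the two nearest normal points are at positions 1 and 3, so the abnormal value is initialized to their mean.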

Abstract

The invention discloses a method for repairing abnormal data points in time series data on the basis of global information. The method comprises the following steps: from the obtained original time series data and the positions of the abnormal data points, initializing each abnormal data point with the mean of the k data points nearest to it, i.e., a preliminary repair; gathering similar sub-series data into the same cluster; searching for similar repaired sub-series data; taking the similar sub-series data in the time series data as global information; and finally, using the weighted sum of the mean values provided by multiple pieces of similar sub-series data as the value of the repaired abnormal data point, so as to reduce the influence of the error of any single piece of similar sub-series data. Through these steps, sudden abnormal data points in the time series data can be repaired accurately.
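The final step, combining the mean values contributed by several similar sub-series, could look roughly like the sketch below. The abstract does not state how the weights are chosen, so the inverse-distance weighting, the `eps` guard, and the name `weighted_repair` are all assumptions for illustration only.

```python
from typing import Sequence


def weighted_repair(
    candidate_means: Sequence[float],
    distances: Sequence[float],
    eps: float = 1e-9,
) -> float:
    """Combine candidate repair values into one weighted sum.

    candidate_means[j] is the mean value supplied by the j-th similar
    sub-series; distances[j] is its distance to the sub-series being repaired.
    Closer sub-series get larger weights (assumed inverse-distance scheme).
    """
    if not candidate_means or len(candidate_means) != len(distances):
        raise ValueError("need one distance per candidate mean")
    raw = [1.0 / (d + eps) for d in distances]
    total = sum(raw)
    # Normalize so the weights sum to 1, then accumulate.
    return sum((w / total) * m for w, m in zip(raw, candidate_means))


# Hypothetical candidates: the closer sub-series dominates the result.
value = weighted_repair([10.0, 20.0], [1.0, 3.0])
```

Averaging over several candidates is what limits the effect of an error in any single similar sub-series, which is the motivation the abstract gives for the weighted sum.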

Description

Technical field

[0001] The invention belongs to the field of data cleaning, and more specifically relates to a method for repairing abnormal data points in time series data based on global information.

Background technique

[0002] With the widespread use of various sensors, more and more time-series data are collected and applied in daily life, such as temperature data and GPS trajectory data. As a hot research topic nowadays, time series data mining has great value. However, dirty data also widely exists in time series data, which has a huge impact on the mining and analysis of time series data. There is no doubt that repairing abnormal data points in time series data can improve data quality and effectively improve the results of data mining, which is of great significance.

[0003] Among them, repairing sudden abnormal data points in time series data is an important content in data cleaning. Inaccurate or erroneous data points are often present in time series data due...

Application Information

Patent Type & Authority: Patents (China)
IPC (8): G06F11/07, G06F16/215, G06F16/28, G06K9/62
CPC: G06F11/0793
Inventors: 王晓玲, 刘小捷, 宋光旋
Owner: EAST CHINA NORMAL UNIV