
A time series data cleaning method and system

A time series data cleaning technology, applied in the field of data processing

Active Publication Date: 2018-01-16
BEIJING TECHNOLOGY AND BUSINESS UNIVERSITY
4 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

[0014] The technical problem to be solved by the present invention is to provide a time series data cleaning method and system, based on Kalman filtering and statistical averaging, that removes outliers (wild points) and high-frequency noise from the original data in preparation for subsequent data analysis. It addresses the shortcoming of current methods, which can only handle missing values, outliers and noisy data separately.




Embodiment Construction

[0065] The principles and features of the present invention are described below in conjunction with the accompanying drawings, and the examples given are only used to explain the present invention, and are not intended to limit the scope of the present invention.

[0066] As shown in Figure 1, a time series data cleaning method according to the present invention specifically includes the following steps:

[0067] Step 1: Collect a piece of raw data, which includes a plurality of original time series;

[0068] Step 2: Perform irregular random sampling on the original time series data to obtain multiple pieces of data with unequal sampling intervals;

[0069] Step 3: Estimate all of the unequally sampled data to obtain multiple pieces of estimated data;

[0070] Step 4: Fill the gaps that random sampling left in the estimated data to obtain multiple pieces of completed estimated data, each containing multiple data points;

[0071] Step 5: Classify al...
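Steps 2 to 4 above can be sketched roughly as follows. This is a minimal illustration and not the patent's implementation: the scalar random-walk Kalman filter, the linear interpolation used to fill gaps, and every name and parameter (`irregular_sample`, `kalman_random_walk`, `sample_estimate_fill`, `keep_prob`, `q`, `r`) are assumptions introduced here, since the text shown does not specify the estimator or the filling rule.

```python
import numpy as np

def irregular_sample(n_points, keep_prob, rng):
    """Pick a random, irregularly spaced subset of indices (endpoints always kept)."""
    mask = rng.random(n_points) < keep_prob
    mask[0] = mask[-1] = True
    return np.flatnonzero(mask)

def kalman_random_walk(t, z, q=0.1, r=1.0):
    """Scalar Kalman filter with a random-walk state model on unequal time steps.

    q (process noise per unit time) and r (measurement noise) are assumed values.
    """
    x, p = float(z[0]), r
    est = np.empty(len(z), dtype=float)
    est[0] = x
    for k in range(1, len(z)):
        dt = t[k] - t[k - 1]
        p = p + q * dt              # predict: uncertainty grows with the gap length
        gain = p / (p + r)          # Kalman gain
        x = x + gain * (z[k] - x)   # update the state with the new sample
        p = (1.0 - gain) * p
        est[k] = x
    return est

def sample_estimate_fill(t, y, n_runs=20, keep_prob=0.6, seed=0):
    """Repeat: sample irregularly, estimate, then fill gaps back onto the full grid."""
    rng = np.random.default_rng(seed)
    runs = np.empty((n_runs, len(t)))
    for i in range(n_runs):
        idx = irregular_sample(len(t), keep_prob, rng)
        est = kalman_random_walk(t[idx], y[idx])
        # Assumed gap filling: linear interpolation between estimated points.
        runs[i] = np.interp(t, t[idx], est)
    return runs

if __name__ == "__main__":
    t = np.arange(50.0)
    y = np.sin(t / 8.0)
    runs = sample_estimate_fill(t, y)
    print(runs.shape)  # one completed estimate per sampling run
```

Each row of the returned array is one completed estimate on the original time grid, so the later steps can compare values across rows at the same time point.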



Abstract

The present invention relates to a time series data cleaning method and system. The method includes: Step 1: collecting a piece of original data, which includes a plurality of original time series; Step 2: randomly sampling and estimating the original time series data to obtain multiple pieces of estimated data, then filling in the gaps produced by random sampling to obtain multiple pieces of completed estimated data; Step 3: classifying all completed estimated data by sampling time point to obtain multiple groups of time-classified data, and sorting each group by magnitude to obtain multiple sorted arrays; Step 4: processing each sorted array to obtain a corresponding average value, so that the multiple sorted arrays yield multiple average values, which together constitute a mean value sequence; Step 5: outputting the mean value sequence, which is the data with outliers (wild points) and high-frequency noise removed. The method provides all-in-one data cleaning that handles missing values, removes outliers and smooths noisy data.

Description

Technical field

[0001] The invention relates to a time series data cleaning method and system, belonging to the technical field of data processing.

Background technique

[0002] Data analysis is currently a hot topic in artificial intelligence and database research. The first step in the data analysis process is data preprocessing. Data preprocessing can effectively improve data quality and provide more targeted, usable data for the core data mining stage. This not only saves considerable time and space, but also allows the mining results to play a better role in decision-making and prediction.

[0003] According to statistics, data preprocessing accounts for 60% of the entire workload in time series data analysis, which shows its importance. The reason is that original time series data often contains outliers (wild points) and high-frequency noise, because real-world data is often incomplete, noisy and inconsistent, and the real data trend is los...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
Inventors: 金学波, 窦超
Owner: BEIJING TECHNOLOGY AND BUSINESS UNIVERSITY