A Time Series Data Completion Method Based on Distance Matrix

A time series and distance matrix technology, applied in the fields of electrical digital data processing, digital data information retrieval, and special data processing applications. It addresses the poor interpretability, insufficient adaptability, and poor completion quality of existing methods, and achieves a good completion effect, strong interpretability, and a clear physical meaning.

Active Publication Date: 2022-04-22
NANJING UNIV

AI Technical Summary

Problems solved by technology

The advantage of this method is that it is simple and efficient; the disadvantage is that it ignores the correlation between multidimensional data and performs poorly when a large amount of data is missing or when the missing data is continuous.

2) Non-negative matrix factorization: on the premise that multiple time series share the same pattern, each sequence is assumed to be expressible as a linear combination of a set of basis vectors. Non-negative matrix factorization is applied to the known data to find the basis vectors and the combination coefficients of each sequence, and the complete time series is restored by multiplying the coefficients by the basis vectors. The advantage of this method is that it makes full use of multiple sources of information; the disadvantage is that it is poorly interpretable and cannot explicitly model the various intrinsic physical laws.

3) Completion based on the hidden Markov model: the time series is assumed to be a sequence of observations with a real state sequence hidden behind it. The state sequence models the internal physical laws, the mapping from states to observations expresses the relationship between states and values in the time series, and the missing data is filled in by decoding the hidden state sequence corresponding to the missing part. The advantage is that physical laws, including temporal smoothness, can be modeled explicitly; the disadvantage is that it is not suitable for more complex spatial correlations.

In summary, the existing methods do not fully consider the nature of time series data itself, and their treatment of temporal characteristics is limited to temporal smoothness.


Embodiment Construction

[0029] Embodiments of the present invention will be described in further detail below in conjunction with the accompanying drawings.

[0030] Through analysis of existing time series data sets, the present invention finds that time series data exhibits not only the simple property of temporal smoothness but also a more complex high-order temporal correlation: cross-time similarity and periodicity. The data shows similar and periodic characteristics over one or more time spans. For example, in the above-mentioned scenario of continuously recording user activities with a smartphone, the user's activity data is periodic with a cycle of one week and also periodic with a cycle of one day. However, in many complex scenarios that lack prior knowledge, it is very difficult to manually mine all the periodicity behind time series data.
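As a rough illustration of how such periodicity can surface without prior knowledge, the sketch below builds a pointwise distance matrix for a synthetic hourly series and averages it along each diagonal, so that lags matching a cycle show small mean distances. The synthetic data, the absolute-difference metric, and the helper name `mean_distance_by_lag` are assumptions made for this example; they are not taken from the patent.

```python
import numpy as np

def mean_distance_by_lag(x, max_lag):
    """Mean pointwise distance between samples that are `lag` steps apart."""
    x = np.asarray(x, dtype=float)
    # Assumed metric: absolute difference between scalar data points.
    D = np.abs(x[:, None] - x[None, :])
    # The lag-th diagonal of D holds the distances between x[i] and x[i + lag].
    return np.array([np.diagonal(D, offset=lag).mean()
                     for lag in range(1, max_lag + 1)])

# Synthetic "activity" signal: two weeks of hourly samples with a daily
# and a weekly component (hypothetical data, for illustration only).
hours = np.arange(24 * 14)
activity = (np.sin(2 * np.pi * hours / 24)
            + 0.5 * np.sin(2 * np.pi * hours / (24 * 7)))

profile = mean_distance_by_lag(activity, max_lag=24 * 7)
best_lag = int(np.argmin(profile)) + 1   # lag with the smallest mean distance
print("lag with smallest mean distance:", best_lag)
print("mean distance at lag 12 vs lag 24:", profile[11], profile[23])
```

In this toy signal the daily cycle shows up as a lower mean distance at lag 24 than at lag 12, and the full weekly cycle as the lowest value overall; real data would be noisier and may need a different metric.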

[0031] Below in conjunction with specific embodiment, further illustra...


Abstract

The invention discloses a time series data completion method based on a distance matrix. The method mines and exploits the inherent high-order temporal correlation of time series data to complete missing data using similar data points in the same series. Specifically, the method includes: modeling the distance matrix D of the time series based on a chosen distance metric function, where the matrix element D_ij in the i-th row and j-th column is the distance between the i-th data point and the j-th data point of the time series; based on the obtained distance matrix D, finding the k segments in the original time series that are closest to the missing segment; and using these k nearest neighbor segments to complete the missing segment. The method achieves a good completion effect in real-world scenarios with missing time series data. It is also highly interpretable and has a clear physical meaning, so it can be extended and used effectively in a variety of real scenarios.
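The three steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the abstract does not say how a missing segment is compared with candidate segments, so the sketch assumes candidates are ranked by the mean distance between the observed points surrounding the gap and the points surrounding each candidate, and that the gap is filled with the average of the k best candidates. The function names, the absolute-difference metric, and the `context` parameter are all hypothetical.

```python
import numpy as np

def distance_matrix(x):
    """D[i, j] = |x[i] - x[j]| (assumed metric); NaN if either point is missing."""
    x = np.asarray(x, dtype=float)
    return np.abs(x[:, None] - x[None, :])

def complete_gap(x, start, length, k=3, context=4):
    """Fill x[start:start+length] from the k segments with the closest context."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    D = distance_matrix(x)
    # Context offsets relative to the gap start: points just before and just after.
    offsets = np.concatenate([np.arange(-context, 0),
                              np.arange(length, length + context)])
    offsets = offsets[(start + offsets >= 0) & (start + offsets < n)]
    offsets = offsets[~np.isnan(x[start + offsets])]
    if offsets.size == 0:
        raise ValueError("no observed context around the gap")
    lo = max(0, -int(offsets.min()))
    hi = min(n - length, n - 1 - int(offsets.max()))
    scored = []
    for s in range(lo, hi + 1):
        if np.isnan(x[s:s + length]).any():
            continue                      # candidate itself must be fully observed
        d = D[start + offsets, s + offsets]
        if np.isnan(d).any():
            continue                      # candidate context must be observed too
        scored.append((float(d.mean()), s))
    if not scored:
        raise ValueError("no fully observed candidate segment")
    scored.sort()
    neighbors = [s for _, s in scored[:k]]
    filled = x.copy()
    # Assumed completion rule: average the k nearest neighbor segments.
    filled[start:start + length] = np.mean(
        [x[s:s + length] for s in neighbors], axis=0)
    return filled, neighbors

# Usage on a synthetic periodic series with a 6-point gap (illustrative data).
t = np.arange(240)
truth = np.sin(2 * np.pi * t / 24)
x = truth.copy()
x[100:106] = np.nan
filled, neighbors = complete_gap(x, start=100, length=6, k=3)
print("neighbor segment starts:", neighbors)
print("max abs error in gap:", np.abs(filled[100:106] - truth[100:106]).max())
```

Because every selected neighbor can be inspected directly, a filled value can always be traced back to the observed segments it came from, which is one way to read the interpretability claim above.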

Description

Technical field

[0001] The invention belongs to the field of computer applications, and in particular relates to an efficient data completion method for data loss caused by equipment performance limitations, network transmission errors, user privacy protection, and other causes during time series data collection and transmission; specifically, a time series data completion method based on a distance matrix.

Background technique

[0002] Time series data is a collection of observations obtained in chronological order. Its main properties are a large volume of data, high dimensionality, and the need for continuous updating. Time series data is ubiquitous in many kinds of applications, such as behavior capture, sensor networks, weather forecasting, and financial market modeling. The main purpose of analyzing time series is to identify the underlying patterns behind the data in order to predict future trends. There are many existing mathematical tools fo...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/215, G06K9/62
CPC: G06F16/215, G06F18/24143
Inventor: 汪亮, 吴思萌, 陶先平, 吕建
Owner: NANJING UNIV