Time sequence data stream clustering method based on wavelet attenuation synopsis tree

A data stream clustering and time series technology, applied in text database clustering/classification, electronic digital data processing, special data processing applications, and related fields.

Inactive Publication Date: 2017-10-24
ZHEJIANG GONGSHANG UNIVERSITY

AI Technical Summary

Problems solved by technology

[0004] To overcome the deficiencies of existing methods in storing and mining time series data streams, which are dynamic, nonlinear, high-dimensional, complex, and redundant, the present invention provides a time series data stream clustering method based on a wavelet decay summary tree. The method offers a wavelet summary structure that reflects the attenuation characteristics of the data stream and supports efficient data stream knowledge mining.



Examples


Embodiment Construction

[0040] The present invention will be further described below in conjunction with the accompanying drawings.

[0041] A time series data stream clustering method based on a wavelet decay summary tree, comprising the following steps:

[0042] Step 1: referring to Figure 1, construct the tree-like attenuation synopsis based on the wavelet transform, through the following steps:

[0043] (11) Compressed data node threshold filtering. Assuming that the time series is stable, the incoming data in the time series are regarded as the first layer. If the data arriving at the same time consist of n values on average, these n values form one data node, and the first layer contains M/n data nodes, where M is the total number of data in the time series.
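The first-layer grouping in step (11) can be illustrated with a minimal sketch. It assumes the stream has already been buffered into a list and that each data node holds exactly n consecutive values; the function name build_first_layer is hypothetical and does not come from the patent, and the threshold filtering itself is not shown because the source does not give its details.

```python
from typing import List, Sequence


def build_first_layer(series: Sequence[float], n: int) -> List[List[float]]:
    """Group a buffered time series into consecutive data nodes of n values.

    Hypothetical helper: when len(series) == M is a multiple of n,
    the result contains M / n data nodes.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return [list(series[i:i + n]) for i in range(0, len(series), n)]


# Example: M = 16 values grouped into nodes of n = 4 values each.
series = [float(v) for v in range(16)]
nodes = build_first_layer(series, n=4)
node_count = len(nodes)  # 16 / 4 data nodes in the first layer
```

If M is not a multiple of n, this sketch leaves a shorter final node; how the patent treats that case is not stated in the available text.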

[0044] (12) Data preprocessing. Noise handling is applied to the real-time time series, mainly by processing missing values in the data series. Suppose the number of attributes of the data sequence is M, if...


Abstract

The invention discloses a time series data stream clustering method based on a wavelet attenuation synopsis tree. First, time series attenuation characteristics are introduced into the wavelet synopsis structure, and a method is provided for constructing a tree-like attenuation synopsis of a multi-dimensional time series based on the wavelet transform. By retaining only the r < n most important wavelet coefficients, a good approximation of the original sequence can be reconstructed, which mitigates the curse of dimensionality; the resulting synopsis structure approximately represents the time series. On this basis, similarity features of the data stream are quickly extracted from the synopsis structure, and the approximate distance between the data stream and each cluster center is calculated, so that the K-means clustering method can be applied to multi-dimensional time series data streams. This addresses the problem that traditional clustering methods cannot be applied directly to data streams that are unbounded in length, evolve over time, and have large data volume.
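The idea of keeping only the r < n most important wavelet coefficients, and measuring distance on the resulting synopsis, can be sketched as follows. This is an illustration under stated assumptions, not the patented construction: it uses the orthonormal Haar wavelet (the source only says "wavelet transformation"), treats "most important" as largest absolute value, requires a power-of-two length, and omits the attenuation weighting and the tree structure. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def haar_transform(x: Sequence[float]) -> List[float]:
    """Orthonormal Haar transform; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = [float(v) for v in x]
    length = n
    while length > 1:
        half = length // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:length] = avg + diff  # averages first, then detail coefficients
        length = half
    return out


def inverse_haar(coeffs: Sequence[float]) -> List[float]:
    """Invert haar_transform."""
    out = [float(c) for c in coeffs]
    length = 2
    while length <= len(out):
        half = length // 2
        rec = [0.0] * length
        for i in range(half):
            rec[2 * i] = (out[i] + out[half + i]) / math.sqrt(2)
            rec[2 * i + 1] = (out[i] - out[half + i]) / math.sqrt(2)
        out[:length] = rec
        length *= 2
    return out


def keep_top_r(coeffs: Sequence[float], r: int) -> List[float]:
    """Zero all but the r largest-magnitude coefficients (assumed importance rule)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:r])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def approx_distance(syn_a: Sequence[float], syn_b: Sequence[float]) -> float:
    """Euclidean distance between two synopses in the coefficient domain."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(syn_a, syn_b)))


# A piecewise-constant series is captured by very few Haar coefficients.
x = [2.0, 2.0, 2.0, 2.0, 10.0, 10.0, 10.0, 10.0]
synopsis = keep_top_r(haar_transform(x), r=2)
reconstructed = inverse_haar(synopsis)
```

Because this transform is orthonormal, the distance between two full coefficient vectors equals the Euclidean distance between the original series, so the distance between truncated synopses approximates it. A K-means assignment step could then compare each stream's synopsis with each cluster center's synopsis using approx_distance.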

Description

Technical field

[0001] The invention relates to the technical field of data stream mining, in particular to a time series data stream clustering method based on a wavelet attenuation summary tree. It is especially suitable for smart business and smart finance applications with large numbers of users and resources. In such applications, multi-dimensional time series data streams are stored, compressed, and clustered by means of cloud computing and electronic tags.

Technical background

[0002] With the rapid development of Internet of Things, cloud computing, mobile Internet, and electronic label technologies, and with the growing demand for technology in the macro environment of the retail industry, data stream mining has become an important means of reducing operating costs and improving operational efficiency. However, due to the continuous generation of a large amount of sequential data that evolves over time in smart retail, a time-series data flow containing info...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/30
CPC: G06F16/2474, G06F16/2465, G06F16/35
Inventor: 肖亮, 郭飞鹏
Owner: ZHEJIANG GONGSHANG UNIVERSITY