Real-time clustering method for evolution data stream

A clustering method for data streams, applied in structured data retrieval, database models, relational databases, and related areas. It addresses the limited application range of existing data stream clustering algorithms, with the effect of improving stability, improving processing capability, and expanding the range of applications.

Inactive Publication Date: 2018-07-24
NAT UNIV OF DEFENSE TECH

AI Technical Summary

Problems solved by technology

Although researchers at home and abroad have made many attempts at data stream clustering for evolving data streams, they have mainly focused on the emergence of new classes, which severely limits the application range of data stream clustering algorithms. It is therefore necessary to extend data stream clustering techniques so that they can handle multiple forms of data evolution.

Examples

Embodiment Construction

[0033] The present invention will be further described below in conjunction with the accompanying drawings.

[0034] Figure 1 is a schematic flow diagram of the principle of the present invention. First, three types of sets are established (the valid class set, the vanishing class set, and the outlier set); then each point to be processed is assigned to one of the sets; finally, the three sets are updated. The basic elements of the valid class set and the vanishing class set are classes; the basic elements of the outlier set are processing points, that is, the basic units that make up the data stream, also called data points. The valid class set stores the classes that still have clustering significance for the data stream at the current moment, and its initial value is the set of classes obtained by applying a static clustering method after a certain amount of data to be processed has been collected; It is ...
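
The set structure and the assignment step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented procedure: the centroid summary of a class, the distance threshold `radius`, the `last_seen` time stamp, and the names `Cluster`, `StreamState`, `nearest`, and `assign` are all hypothetical, and the initial valid classes are assumed to be supplied by some static clustering step that is not shown.

```python
from dataclasses import dataclass, field
from math import dist


@dataclass
class Cluster:
    """A class summarised by its centroid and point count (assumed summary)."""
    center: tuple
    count: int = 1
    last_seen: int = 0  # time step of the most recently absorbed point

    def absorb(self, point, t):
        # Running-mean update of the centroid with the new point.
        self.count += 1
        self.center = tuple(c + (p - c) / self.count
                            for c, p in zip(self.center, point))
        self.last_seen = t


@dataclass
class StreamState:
    valid: list = field(default_factory=list)     # classes still meaningful now
    vanished: list = field(default_factory=list)  # classes that have vanished
    outliers: list = field(default_factory=list)  # raw data points, not classes


def nearest(clusters, point, radius):
    """Return the closest cluster within `radius` of `point`, or None."""
    best = min(clusters, key=lambda c: dist(c.center, point), default=None)
    if best is not None and dist(best.center, point) <= radius:
        return best
    return None


def assign(state, point, t, radius=1.0):
    """Assign one incoming point to exactly one of the three sets."""
    hit = nearest(state.valid, point, radius)
    if hit is not None:
        hit.absorb(point, t)
        return "valid"
    hit = nearest(state.vanished, point, radius)
    if hit is not None:
        hit.absorb(point, t)
        return "vanished"
    # The point matches no known class: keep it as an outlier for now.
    state.outliers.append((point, t))
    return "outlier"
```

For example, with `state = StreamState(valid=[Cluster((0.0, 0.0))])`, a call such as `assign(state, (0.2, -0.1), t=1)` absorbs the point into the existing class, while a distant point is stored in `state.outliers`. Checking the valid classes before the vanished ones is a design choice of this sketch only.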

Abstract

The invention provides an online clustering method for evolving data streams. According to the technical scheme, the method comprises the following steps: 1, a valid-class set, a vanishing-class set, and an outlier set are established; 2, the to-be-processed point obtained at the current moment is assigned to one of the sets; 3, the outlier set, the valid-class set, and the vanishing-class set are updated. For the three typical evolution modes of a class in an evolving data stream, namely emergence, vanishing, and re-emergence, detection functions are designed separately and then integrated into a unified method, which improves the stability of data stream clustering and enlarges its application range.
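
The update step with its three detection functions might look roughly like the sketch below. The abstract does not state the actual detection criteria, so everything here is an assumption made for illustration: classes are plain dictionaries with a `center` and a `last_seen` time, vanishing is detected by idle time (`max_idle`), re-emergence by a recent hit on a vanished class, and emergence by a dense group of outliers (`radius`, `min_points`). All function and parameter names are hypothetical.

```python
from math import dist


def detect_vanishing(valid, vanished, now, max_idle):
    """Move valid classes that absorbed no point for more than `max_idle` steps."""
    for c in [c for c in valid if now - c["last_seen"] > max_idle]:
        valid.remove(c)
        vanished.append(c)


def detect_reemergence(valid, vanished, now, max_idle):
    """Move vanished classes that were hit recently back to the valid set."""
    for c in [c for c in vanished if now - c["last_seen"] <= max_idle]:
        vanished.remove(c)
        valid.append(c)


def detect_emergence(valid, outliers, now, radius, min_points):
    """Promote the first dense group of outliers found to a new valid class."""
    for seed in list(outliers):
        group = [p for p in outliers if dist(p, seed) <= radius]
        if len(group) >= min_points:
            center = tuple(sum(x) / len(group) for x in zip(*group))
            valid.append({"center": center, "last_seen": now})
            for p in group:
                outliers.remove(p)
            return  # at most one new class per update, to keep the sketch simple


def update(valid, vanished, outliers, now,
           max_idle=50, radius=1.0, min_points=3):
    """Run the three detectors in a fixed order (an assumption of this sketch)."""
    detect_reemergence(valid, vanished, now, max_idle)
    detect_vanishing(valid, vanished, now, max_idle)
    detect_emergence(valid, outliers, now, radius, min_points)
```

Running re-emergence before vanishing means a class that was just revived is not immediately retired again, since its idle time is within `max_idle` by construction.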

Description

Technical field

[0001] The invention belongs to the technical field of data stream clustering, and in particular relates to a dynamic clustering method for evolving data streams.

Background art

[0002] A data stream is data that arrives in real time, as opposed to traditional batch-acquired data. According to whether the data distribution changes, data streams are usually divided into static data streams (the distribution does not change) and evolving data streams (the distribution changes). Evolving data streams are also known as dynamic data streams. Data streams have become one of the main forms of data in the information society, for example financial transaction data, communication record data, and sensor observation data. Data stream clustering refers to the analysis of data streams by means of clustering. Because it does not rely on prior information, it has become one of the main tools of data stream mining.

[000...

Application Information

IPC(8): G06F17/30
CPC: G06F16/285, G06F16/24568
Inventor: 隋金坪, 刘振, 黎湘
Owner: NAT UNIV OF DEFENSE TECH