Mahalanobis distance measurement method based on the weighted Moore-Penrose inverse in the data mining process

A Mahalanobis distance and data mining technology, applied in the fields of electrical digital data processing, special data processing applications, and instruments, that addresses problems such as poor stability on correlated data, failure to fully preserve the mean and variance of the data source, and poor reliability.

Inactive Publication Date: 2011-03-09
ZHEJIANG UNIV OF TECH

AI Technical Summary

Problems solved by technology

[0029] Existing distance measures used in the data mining process are affected by the units of measurement (dimension) of the attributes, cannot fully preserve the mean and variance of the data source, and are unstable and unreliable when processing correlated data. To overcome these shortcomings, the present invention provides a Mahalanobis distance measurement method based on the weighted Moore-Penrose inverse. The method is unaffected by units of measurement (it is invariant under linear transformation), preserves the mean and variance information of the data, and runs normally with higher performance regardless of how the data are correlated.
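The invariance claim can be illustrated with the classical Mahalanobis distance. The sketch below is not the patented weighted Moore-Penrose method: it assumes a nonsingular covariance matrix, synthetic data, and a hypothetical per-attribute change of units (`scale`). It only shows that rescaling the attributes leaves the Mahalanobis distance unchanged while the Euclidean distance changes.

```python
import numpy as np

def mahalanobis(x, y, cov):
    """Classical Mahalanobis distance; assumes cov is nonsingular."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.sqrt(d @ np.linalg.solve(cov, d)))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))         # 50 individuals, 3 attributes (synthetic)
scale = np.diag([100.0, 0.5, 3.0])   # hypothetical change of units per attribute
Y = X @ scale                        # the same data expressed in other units

d_original = mahalanobis(X[0], X[1], np.cov(X, rowvar=False))
d_rescaled = mahalanobis(Y[0], Y[1], np.cov(Y, rowvar=False))

euclid_original = float(np.linalg.norm(X[0] - X[1]))
euclid_rescaled = float(np.linalg.norm(Y[0] - Y[1]))
```

Here `d_original` and `d_rescaled` agree up to round-off, whereas the two Euclidean values differ because the first attribute dominates after rescaling.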

Examples

Embodiment Construction

[0135] The present invention will be further described below.

[0136] A Mahalanobis distance measurement method based on the weighted Moore-Penrose inverse in the data mining process. Let a be a vector or matrix; then a^T denotes the transpose of a.

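[0137] Let X_1, X_2, ..., X_m be m data individuals, where X_i = (x_{i1}, x_{i2}, ..., x_{in}), i = 1, 2, ..., m, and n is the number of attributes of X_i. The data as a whole can then be expressed as X = (X_1, X_2, ..., X_m)^T, that is:

[0138] $$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1n} \\ x_{21} & x_{22} & \cdots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \cdots & x_{mn} \end{pmatrix}$$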

[0139] The measurement method comprises the following steps:

[0140] 1) Calculate the covariance matrix S of the data population X,

[0141] where S is an n × n matrix.

[0142] 2) Expand the covariance matrix S according to the spectral decomposition theorem for real symmetric matrices, which gives

[0143] $$S = \sum_{i=1}^{n} \lambda_i e_i e_i^T$$

[0144] where λ_i is the i-th eigenvalue of S, e_i is the corresponding n-dimensional normalized eigenvector (a column vector), i = 1, 2, …, n, and ...
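Steps 1) and 2) can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the patent's own code: the data are synthetic, and the covariance uses NumPy's default 1/(m-1) normalisation, which the text above does not confirm. One attribute is made a linear combination of two others so that S is singular, the case in which an ordinary inverse of S does not exist.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 20, 4                      # m individuals, n attributes (synthetic)
X = rng.normal(size=(m, n))
X[:, 3] = X[:, 0] + X[:, 1]       # redundant attribute, so S is singular

# Step 1: n x n covariance matrix of the data population X
# (assumption: 1/(m-1) normalisation, NumPy's default).
S = np.cov(X, rowvar=False)

# Step 2: spectral decomposition of the real symmetric matrix S.
# eigh returns eigenvalues in ascending order and orthonormal
# eigenvectors as the columns of eigvecs.
eigvals, eigvecs = np.linalg.eigh(S)

# Rebuild S as the sum of lambda_i * e_i * e_i^T to check the expansion.
S_rebuilt = sum(lam * np.outer(e, e) for lam, e in zip(eigvals, eigvecs.T))

# Numerical rank: count eigenvalues that are not negligible.
rank = int(np.sum(eigvals > 1e-10 * eigvals.max()))
```

With the redundant attribute, `rank` is 3 rather than 4, which is why a generalized inverse is needed in the later steps.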

Abstract

The invention relates to a Mahalanobis distance measurement method based on the weighted Moore-Penrose inverse in the data mining process, comprising the following steps: 1) calculating the covariance matrix S of the data population X; 2) expanding the covariance matrix S according to the spectral decomposition theorem for real symmetric matrices; 3) constructing the weight matrices M and N, specifically: 1. constructing an n × n matrix M; and 2. constructing an n × n matrix N; 4) calculating the weighted Moore-Penrose inverse of the covariance matrix S; and 5) calculating the Mahalanobis distance between data individuals X_i and X_j. The method is unaffected by units of measurement (it is invariant under linear transformation), preserves the mean and variance information of the data, and runs normally with higher performance regardless of how the data are correlated.
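Steps 4) and 5) can be sketched as below. The abstract does not say how M and N are constructed, so the weights here are arbitrary placeholders, and the function names `spd_sqrt`, `weighted_pinv`, and `wmp_mahalanobis` are hypothetical. The sketch assumes M and N are symmetric positive definite and uses the standard identity A⁺_{MN} = N^{-1/2} (M^{1/2} A N^{-1/2})⁺ M^{1/2}. The `rcond` cutoff is also an assumption.

```python
import numpy as np

def spd_sqrt(W):
    """Symmetric square root of a symmetric positive definite matrix."""
    vals, vecs = np.linalg.eigh(W)
    return (vecs * np.sqrt(vals)) @ vecs.T

def weighted_pinv(A, M, N, rcond=1e-10):
    """Weighted Moore-Penrose inverse of A for SPD weights M and N."""
    M_half = spd_sqrt(M)
    N_half_inv = np.linalg.inv(spd_sqrt(N))
    core = np.linalg.pinv(M_half @ A @ N_half_inv, rcond=rcond)
    return N_half_inv @ core @ M_half

def wmp_mahalanobis(xi, xj, S, M, N):
    """Distance sqrt((xi - xj) G (xi - xj)^T), with G the weighted inverse of S."""
    d = np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)
    q = d @ weighted_pinv(S, M, N) @ d
    return float(np.sqrt(max(q, 0.0)))  # guard against a negative radicand

rng = np.random.default_rng(2)
X = rng.normal(size=(30, 3))
X = np.column_stack([X, X[:, 0] - X[:, 2]])  # redundant attribute, singular S
S = np.cov(X, rowvar=False)

M = np.diag([1.0, 2.0, 3.0, 4.0])  # placeholder weights, not the patent's M
N = np.diag([4.0, 3.0, 2.0, 1.0])  # placeholder weights, not the patent's N
G = weighted_pinv(S, M, N)

# With identity weights the result reduces to the ordinary pseudoinverse.
dist_identity = wmp_mahalanobis(X[0], X[1], S, np.eye(4), np.eye(4))
```

The matrix `G` satisfies the four defining conditions of the weighted Moore-Penrose inverse: S G S = S, G S G = G, and both M S G and N G S are symmetric. The distance is well defined even though S has no ordinary inverse.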

Description

Technical field

[0001] The invention relates to the technical field of data mining, and in particular to a WMP (weighted Moore-Penrose) Mahalanobis distance measurement method for processing limited correlation data sets.

Background technique

[0002] As enterprises and industries continue to accumulate business data, massive data sets have formed. Relying solely on manual sorting or interpretation of such a large data source is no longer adequate in either efficiency or accuracy. More and more enterprises therefore use data mining technology to organize massive data, discover knowledge, and support enterprise decision-making. Data preprocessing accounts for about 60%-70% of the workload of the entire data mining process and plays a vital role in the results of data mining. An important step in data preprocessing is to fill in the missing data in the original data. In the process of filling in missing values, the d...

Application Information

IPC(8): G06F17/30
Inventors: 黄德才, 陈欢, 陆亿红, 沈雯燕
Owner: ZHEJIANG UNIV OF TECH