Missing data recovery method and device

A missing data recovery method, applied in the field of data processing, that addresses problems such as the inability to perform principal component analysis on incomplete data, large reconstruction errors, and inaccurate principal components.

Active Publication Date: 2018-03-23
BEIJING GOLDWIND SCI & CREATION WINDPOWER EQUIP CO LTD

AI Technical Summary

Problems solved by technology

However, most of the current data compression algorithms based on principal component analysis need to pre-select batch data for principal component analysis. When the newly generated data cannot be well reconstructed by the current principal component, the update of the principal component is required.
[0004] That is to say, in the case of incomplete data du



Examples


Embodiment 1

[0020] In this embodiment, it is assumed that missing data is included in multiple sets of data.

[0021] Figure 1 is a flowchart showing a method for recovering missing data according to Embodiment 1 of the present invention.

[0022] Referring to Figure 1, first, in step S110, multiple sets of data are acquired and combined into a corresponding numerical matrix. Specifically, multiple sets of data are acquired from a data source. In one embodiment, the data source is one or more monitoring devices; that is, in this step, multiple sets of monitoring data are obtained in time sequence from one or more monitoring devices as the multiple sets of data.

[0023] For example, assuming that the multiple sets of data are the SCADA (Supervisory Control And Data Acquisition) data shown in Table 1 below, then in this step the data are obtained in time sequence from a plurality of sensors serving as monitoring devices, and the multiple sets of data are composed ...
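As an illustration of step S110, the following minimal Python sketch stacks time-ordered sensor records into a numerical matrix and marks gaps as NaN so that their positions can be located later. The function name build_matrix, the record layout, and the sensor names wind_speed and power are hypothetical and are not taken from the patent text or its Table 1.

```python
import numpy as np

def build_matrix(records, sensors):
    """Stack time-ordered records into a (time x sensor) matrix; gaps become NaN."""
    ordered = sorted(records, key=lambda r: r["time"])
    matrix = np.full((len(ordered), len(sensors)), np.nan)
    for i, rec in enumerate(ordered):
        for j, name in enumerate(sensors):
            value = rec.get(name)
            if value is not None:
                matrix[i, j] = float(value)
    return matrix

# Hypothetical sample records; None marks a reading that was never received.
records = [
    {"time": 2, "wind_speed": 6.1, "power": None},
    {"time": 1, "wind_speed": 5.8, "power": 410.0},
    {"time": 3, "wind_speed": 6.4, "power": 455.0},
]
X = build_matrix(records, ["wind_speed", "power"])

# Row and column indices of the missing entries.
missing_positions = np.argwhere(np.isnan(X))
```

Representing gaps as NaN keeps the matrix rectangular, so each missing position is simply a (row, column) pair.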

Embodiment 2

[0056] In this embodiment, not only the missing data in the multiple sets of data is restored, but also data compression is performed on the multiple sets of data.

[0057] Figure 2 is a flowchart showing a method for recovering missing data according to Embodiment 2 of the present invention.

[0058] As shown in Figure 2, in addition to steps S110-S150 for recovering missing data as in Embodiment 1, this embodiment also includes steps S260 and S270 for data compression and decompression. Steps S110-S150 are not described again here.

[0059] In step S260, the multiple sets of data are compressed using the result of the probability matrix decomposition.

[0060] Specifically, based on the following formula (4), the result of the probability matrix decomposition in step S120 is multiplied by the second factor matrix V_k obtained in step S120 to perform dimensionality-reduction compression:
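Formula (4) is not reproduced in this text, so the following Python sketch is not the patent's formula. It only illustrates, under stated assumptions, how multiplying by a factor matrix can reduce dimensionality: an m x n matrix X is assumed to be compressed to m x k by right-multiplying with an n x k matrix V_k, and decompression is assumed to be a least-squares reconstruction. The function names compress and decompress and the random test data are hypothetical.

```python
import numpy as np

def compress(X, V_k):
    """Assumed compression: (m x n) @ (n x k) -> (m x k)."""
    return X @ V_k

def decompress(Z, V_k):
    """Assumed least-squares reconstruction: Z (V_k^T V_k)^{-1} V_k^T."""
    return Z @ np.linalg.solve(V_k.T @ V_k, V_k.T)

# Hypothetical data: an exactly rank-2 matrix built from two factor matrices.
rng = np.random.default_rng(0)
U = rng.standard_normal((6, 2))
V_k = rng.standard_normal((4, 2))
X = U @ V_k.T

Z = compress(X, V_k)        # 6 x 2 instead of 6 x 4
X_hat = decompress(Z, V_k)  # equals X here because X has rank 2
```

When the rows of X do not lie exactly in the span of the columns of V_k, this reconstruction is only an approximation, which is why such compression is lossy in general.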

[0061...

Embodiment 3

[0072] Figure 3 is a block diagram of a device for recovering missing data according to Embodiment 3 of the present invention.

[0073] As shown in Figure 3, the missing data recovery device 300 of this embodiment includes: a data acquisition unit 310, a probability matrix decomposition unit 320, a missing location determination unit 330, a missing data obtaining unit 340, and a data recovery unit 350.

[0074] The data acquisition unit 310 acquires multiple sets of data and composes them into a corresponding numerical matrix. Specifically, the data acquisition unit 310 acquires multiple sets of data from a data source. In one embodiment, the data source is one or more monitoring devices, that is, the data acquisition unit 310 acquires multiple sets of monitoring data from one or more monitoring devices in time sequence as the multiple sets of data.

[0075] In addition, as required, the data acquisition unit 310 also performs preprocessing such as data type conv...


Abstract

The invention provides a missing data recovery method and device for recovering missing data. The missing data recovery method comprises the following steps: multiple sets of data are obtained; probability matrix decomposition is performed on a numerical matrix composed of the multiple sets of data; the position of the missing data in the multiple sets of data is determined; the product of the elements in the probability matrix decomposition result that correspond to the position of the missing data is obtained as the missing data; and the obtained missing data is restored to the position of the missing data in the multiple sets of data.
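The steps in the abstract can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: it assumes the decomposition (commonly called probabilistic matrix factorization) is fitted by plain gradient descent on the observed entries with L2 regularization, and that missing entries are marked as NaN. The function name pmf_recover, the hyperparameters rank, lam, lr and epochs, and the sample matrix are all hypothetical.

```python
import numpy as np

def pmf_recover(X, rank=2, lam=0.01, lr=0.01, epochs=2000, seed=0):
    """Factor X ~ U @ V.T using observed entries only, then fill the gaps."""
    rng = np.random.default_rng(seed)
    observed = ~np.isnan(X)              # True where a value is present
    X0 = np.where(observed, X, 0.0)      # zeros are masked out below
    m, n = X.shape
    U = 0.1 * rng.standard_normal((m, rank))  # first factor matrix
    V = 0.1 * rng.standard_normal((n, rank))  # second factor matrix
    for _ in range(epochs):
        E = observed * (X0 - U @ V.T)    # residual on observed entries only
        U_grad = -E @ V + lam * U
        V_grad = -E.T @ U + lam * V
        U -= lr * U_grad
        V -= lr * V_grad
    recovered = X.copy()
    # For each missing position (i, j), use the product of row i of U
    # and row j of V as the recovered value.
    for i, j in np.argwhere(~observed):
        recovered[i, j] = U[i] @ V[j]
    return recovered, U, V

# Hypothetical rank-1 matrix with one entry removed.
X = np.outer([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0])
X[1, 2] = np.nan
recovered, U, V = pmf_recover(X)
```

Observed entries are copied through unchanged; only the positions that were NaN are replaced by factor products. How close a recovered value is to the true one depends on the chosen rank, regularization, and number of iterations.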

Description

Technical field

[0001] The present invention relates to the field of data processing, and more particularly, to a method and device for recovering missing data.

Background

[0002] In the field of data processing, processing generally requires complete data.

[0003] Taking data compression technology as an example, it is divided into two categories: lossless compression and lossy compression. The data compression algorithm based on PCA (Principal Component Analysis) is a lossy compression algorithm that removes redundancy by exploiting correlation in the data, thereby achieving dimensionality reduction and data compression. However, most current data compression algorithms based on principal component analysis need to pre-select a batch of data for principal component analysis. When newly generated data cannot be well reconstructed by the current principal components, the principal components need to be updated.

[0004] That is to say, in the case of incomplete data due to d...


Application Information

IPC(8): G06F11/14
CPC: G06F11/1402, G06F11/1443
Inventor: 张光磊, 刘源, 邱忠营
Owner: BEIJING GOLDWIND SCI & CREATION WINDPOWER EQUIP CO LTD