Data processing method and device in data modeling

A data processing and data modeling technology, applied in the computer field, that addresses problems such as long computing time, wasted computing resources, and reduced computer work efficiency.

Active Publication Date: 2020-07-07
HUAWEI TECH CO LTD

AI Technical Summary

Problems solved by technology

[0011] Embodiments of the present invention provide a data processing method and device in data modeling. They address problems in the prior art, where the preprocessing flow for original data requires a long calculation time and a large amount of calculation, which increases the operating load of the computer, wastes computing resources, and reduces computer productivity.



Embodiment Construction

[0097] In the data processing method in data modeling provided by the present invention, the data column corresponding to each feature in the read original data is converted according to the data conversion function corresponding to a preset data processing category identifier, generating the corresponding extended feature columns. The extended feature columns corresponding to all the features in the original data are combined to generate an extended feature set. The correlation coefficient of each feature in the extended feature set is determined, the features whose correlation coefficients meet the set condition are selected as important features, and the data columns corresponding to the important features are screened out of the extended feature set. By extending the features, the calculation amount of evaluating various data preprocessing methods is reduced, and the problems of long time consumption and large amount of calculation caused by data modeling through exhaustive...
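To make the claimed saving concrete, the following sketch compares two evaluation counts. The source gives no formulas, so both are assumptions made for illustration: exhaustive search is assumed to pick one of `n_methods` preprocessing methods independently for each of `n_features` features and to evaluate every combination, while feature extension is assumed to generate and score one extended column per (feature, method) pair. The function names are hypothetical.

```python
def exhaustive_evaluations(n_features: int, n_methods: int) -> int:
    """Assumed cost of exhaustive search: one candidate preprocessing
    pipeline per combination of per-feature method choices."""
    return n_methods ** n_features


def extended_evaluations(n_features: int, n_methods: int) -> int:
    """Assumed cost of feature extension: one extended column, and one
    correlation coefficient, per (feature, method) pair."""
    return n_features * n_methods


if __name__ == "__main__":
    # Illustrative sizes only; not taken from the patent.
    for n_features, n_methods in [(5, 3), (10, 4), (20, 4)]:
        print(
            n_features,
            n_methods,
            exhaustive_evaluations(n_features, n_methods),
            extended_evaluations(n_features, n_methods),
        )
```

Under these assumptions the exhaustive count grows exponentially with the number of features, while the extended-feature count grows linearly.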


Abstract

The invention discloses a data processing method and device in data modeling, which are used to solve the problems in the prior art that the preprocessing flow of original data involves a large amount of calculation and a long calculation time, wastes computing resources, and reduces work efficiency. The method is as follows: according to the data conversion function corresponding to a preset data processing category identifier, data conversion is performed on the data column corresponding to each feature in the read original data to generate a corresponding extended feature column, and the extended feature columns corresponding to all the features in the original data are combined to generate an extended feature set; the correlation coefficient of each feature in the extended feature set is determined; the features whose correlation coefficients meet the set condition are selected as important features, and the data columns corresponding to the important features are screened out of the extended feature set. In this way, the problems of long time consumption and large calculation amount caused by data modeling through exhaustive data preprocessing methods are avoided, the calculation efficiency is improved, and the flexibility and adaptability of automatic data modeling are improved.
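The steps in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the registry `CONVERSIONS`, its three conversion functions, the `feature__category` naming scheme, the use of the Pearson coefficient against a target column, and the absolute-value threshold are all hypothetical choices, since the source only speaks of a "correlation coefficient" and a "set condition".

```python
import math

# Hypothetical registry: data processing category identifier -> conversion function.
CONVERSIONS = {
    "identity": lambda x: x,
    "log": lambda x: math.log1p(x),  # assumes non-negative inputs
    "square": lambda x: x * x,
}


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant column."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def extend_features(raw):
    """Apply every conversion to every feature column and combine the
    results into one extended feature set (dict of name -> column)."""
    extended = {}
    for name, column in raw.items():
        for category, convert in CONVERSIONS.items():
            extended[f"{name}__{category}"] = [convert(v) for v in column]
    return extended


def select_important(extended, target, threshold):
    """Score each extended feature against the target (an assumption) and
    keep the columns whose absolute coefficient reaches the threshold."""
    scores = {name: pearson(col, target) for name, col in extended.items()}
    selected = {
        name: extended[name]
        for name, score in scores.items()
        if abs(score) >= threshold
    }
    return selected, scores


if __name__ == "__main__":
    # Made-up data for illustration only.
    raw = {"x": [1, 2, 3, 4, 5], "noise": [3, 1, 3, 1, 3]}
    target = [1, 4, 9, 16, 25]
    selected, scores = select_important(extend_features(raw), target, 0.95)
    print(sorted(selected))
```

Each preprocessing option is scored once as a column, so no model has to be trained per combination of options. A different coefficient or selection rule would slot into `select_important` without changing the rest.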

Description

Technical field

[0001] The invention relates to the field of computer technology, in particular to a data processing method and device in data modeling.

Background

[0002] Data mining is one of the steps in database knowledge discovery; it finds hidden relationships in large amounts of data and extracts valuable information. Data mining usually combines methods and technologies from database technology, statistics, online analytical processing, and machine learning to process data from different perspectives.

[0003] The specific process of data mining includes the following steps: business understanding, data understanding, data preparation, model building, model evaluation, and model deployment.

[0004] In the data preparation process, the acquired raw data needs to be preprocessed. The original data is the wide table data stored in the database or data warehouse, as shown in Table 1, the original data includes missing values ...


Application Information

IPC(8): G06F16/21, G06F16/2458, G06F16/28, G06N20/10
CPC: G06F16/2465, G06F16/26, G06Q40/08, G06N20/10, G06N20/00, G06F16/24575, G06V40/172
Inventor: 李辰, 谭卫国, 汪芳山
Owner: HUAWEI TECH CO LTD