Method for automatic optimal model selection based on big data

An automatic optimal model selection technology, applied in data mining, electrical digital data processing, special data processing applications, etc. It addresses the long running time of existing model selection algorithms, saving time and improving modeling efficiency.

Inactive Publication Date: 2018-07-10
GUANGDONG KINGPOINT DATA SCI & TECH CO LTD
0 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

Although some algorithms for automatically selecting the optimal model exist, they take a long time to run, so these algorithms need to be optimized.

Examples

Embodiment Construction

[0023] The above and other technical features and advantages of the present invention will be described in more detail below in conjunction with the accompanying drawings.

[0024] Figure 1 shows a flow chart of the method for automatically selecting an optimal model based on big data provided by the present invention. The method includes the following steps:

[0025] Step S1: Classify the mining targets.

[0026] Specifically, the mining target is classified to determine which category it belongs to, and the data mining algorithms that may be used for that category are listed.
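
Step S1 can be pictured as a lookup from a target category to a list of candidate algorithms. The following is a minimal sketch only: the category names, the rule in `classify_target`, and the algorithm lists in `CANDIDATE_ALGORITHMS` are illustrative assumptions and do not come from the patent text.

```python
# Hypothetical catalogue: categories and algorithm names are illustrative
# assumptions, not taken from the patent.
CANDIDATE_ALGORITHMS = {
    "classification": ["decision_tree", "naive_bayes", "logistic_regression", "svm"],
    "regression": ["linear_regression", "regression_tree"],
    "clustering": ["k_means", "dbscan"],
}


def classify_target(has_labels, label_is_numeric=False):
    """Assign a mining target to a coarse category (a simplifying assumption)."""
    if not has_labels:
        return "clustering"
    return "regression" if label_is_numeric else "classification"


def list_candidates(has_labels, label_is_numeric=False):
    """Return the category of the mining target and the algorithms listed for it."""
    category = classify_target(has_labels, label_is_numeric)
    return category, CANDIDATE_ALGORITHMS[category]
```

For example, `list_candidates(True)` returns the "classification" category together with its candidate list, which later steps would narrow down.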

[0027] Step S2: Use information gain to perform fast feature selection on the entire data set, and eliminate features that are not relevant to the data mining process.

[0028] Specifically, before the data mining process begins, information gain is used to perform fast feature selection on the entire data set, and features that are not relevant to the subsequent data mining process are excluded.
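
The information-gain filter of step S2 can be sketched as follows. This is a minimal illustration that assumes discrete feature values and a hypothetical cut-off `threshold`; the patent excerpt does not say how continuous features are handled or which cut-off is used.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, threshold=0.01):
    """Keep features whose gain exceeds `threshold` (an assumed cut-off)."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    kept = [name for name, gain in gains.items() if gain > threshold]
    return kept, gains
```

A feature that takes the same value in every record has zero gain and is dropped, while a feature that separates the classes perfectly has a gain equal to the entropy of the labels.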

...

Abstract

The invention provides a method for automatic optimal model selection based on big data. The method comprises: step S1, classifying the mining target; step S2, using information gain to perform rapid feature selection on the whole data set; step S3, establishing a training set and a validation set; step S4, selecting effective data mining algorithms and their parameter combinations; step S5, using Bayesian optimization to select effective parameter combinations for each algorithm; step S6, selecting an optimal data mining algorithm K; step S7, using cross-validation to select and determine the parameter value combination of data mining algorithm K, to obtain a final model; step S8, if the result obtained by the model is relatively poor, repeating steps S2-S7 and selecting the optimal model again until the model result is satisfactory; if the result is satisfactory, outputting the model. The method saves the time consumed by automatic modeling and improves modeling efficiency: the optimal algorithm can be found quickly among a large number of algorithms, and the parameter combination of the optimal algorithm is selected by cross-validation.
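
Steps S3 to S7 can be illustrated with a small, self-contained sketch. Several things here are assumptions rather than the patent's method: the two toy learners (`knn_fit`, `stump_fit`), the parameter grids in `ALGORITHMS`, the 70/30 split, and the shortlist size of three are all hypothetical, and plain random search is used as a stand-in for the Bayesian optimization named in step S5.

```python
import random
from collections import Counter


def knn_fit(params, X, y):
    """Toy k-nearest-neighbour classifier; returns a predict function."""
    k = params["k"]

    def predict(x):
        by_distance = sorted(
            range(len(X)), key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x))
        )
        return Counter(y[i] for i in by_distance[:k]).most_common(1)[0][0]

    return predict


def stump_fit(params, X, y):
    """Toy one-feature threshold classifier; returns a predict function."""
    f, t = params["feature"], params["threshold"]
    default = Counter(y).most_common(1)[0][0]
    left = [label for row, label in zip(X, y) if row[f] <= t]
    right = [label for row, label in zip(X, y) if row[f] > t]
    left_label = Counter(left).most_common(1)[0][0] if left else default
    right_label = Counter(right).most_common(1)[0][0] if right else default
    return lambda x: left_label if x[f] <= t else right_label


# Hypothetical candidate algorithms and parameter grids (the output of S1/S4).
ALGORITHMS = {
    "knn": (knn_fit, {"k": [1, 3, 5, 7]}),
    "stump": (stump_fit, {"feature": [0, 1], "threshold": [0.25, 0.5, 0.75]}),
}


def accuracy(predict, X, y):
    return sum(predict(x) == label for x, label in zip(X, y)) / len(y)


def cross_val_score(fit, params, X, y, folds=5):
    """Mean accuracy over `folds` interleaved folds (assumes len(X) >= folds)."""
    scores = []
    for f in range(folds):
        held_out = set(range(f, len(X), folds))
        X_train = [x for i, x in enumerate(X) if i not in held_out]
        y_train = [label for i, label in enumerate(y) if i not in held_out]
        X_test = [X[i] for i in sorted(held_out)]
        y_test = [y[i] for i in sorted(held_out)]
        scores.append(accuracy(fit(params, X_train, y_train), X_test, y_test))
    return sum(scores) / folds


def select_model(X, y, trials=8, seed=0):
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)
    cut = int(0.7 * len(order))  # S3: training / validation split (assumed 70/30)
    X_tr, y_tr = [X[i] for i in order[:cut]], [y[i] for i in order[:cut]]
    X_va, y_va = [X[i] for i in order[cut:]], [y[i] for i in order[cut:]]

    shortlist = {}
    for name, (fit, grid) in ALGORITHMS.items():
        # S4-S5: random search over the grid, standing in for Bayesian optimization.
        tried = [{p: rng.choice(v) for p, v in grid.items()} for _ in range(trials)]
        scored = [(accuracy(fit(p, X_tr, y_tr), X_va, y_va), p) for p in tried]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        shortlist[name] = scored[:3]

    # S6: the algorithm K whose best validation score is highest.
    best_name = max(shortlist, key=lambda n: shortlist[n][0][0])
    fit = ALGORITHMS[best_name][0]
    # S7: cross-validation picks the final parameter combination for K.
    cv = [(cross_val_score(fit, p, X, y), p) for _, p in shortlist[best_name]]
    best_score, best_params = max(cv, key=lambda pair: pair[0])
    return best_name, best_params, best_score, fit(best_params, X, y)
```

`select_model` returns the chosen algorithm name, its parameters, the cross-validated score, and the fitted model. For step S8, a caller could compare the returned score against an acceptance level of its own choosing and, if the score falls short, repeat from feature selection with different settings.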

Description

Technical field

[0001] The invention relates to the field of data mining, in particular to a method for automatically selecting an optimal model based on big data.

Background technique

[0002] Big data refers to a collection of data that cannot be captured, managed, and processed by conventional software tools within a certain period of time. It is a massive, high-growth, and diverse information asset that requires new processing models in order to deliver stronger decision-making power, insight discovery, and process optimization capabilities. Data mining generally refers to the process of searching, by means of algorithms, for information hidden in a large amount of data. In the era of "big data", massive data urgently needs to be converted into useful information and knowledge, which can then be widely applied in fields such as business management and market analysis. Data mining includes a large number of different...

Application Information

IPC(8): G06F17/30
CPC: G06F16/2453, G06F16/2465, G06F2216/03
Inventor: 邹立斌, 李青海, 侯大勇, 简宋全
Owner: GUANGDONG KINGPOINT DATA SCI & TECH CO LTD