
Universal machine learning data analysis platform

A data analysis and machine learning technology, applied in the computer field, that addresses problems such as limited operational freedom, network bandwidth limits, and poor usability for research and development personnel, and thereby improves work efficiency and operating efficiency.

Active Publication Date: 2017-05-31
FUJIAN YIRONG INFORMATION TECH +3
Cites: 5 · Cited by: 80

AI Technical Summary

Problems solved by technology

However, commercial companies, enterprises, research institutes and similar organizations have complex data and need data analysis that is high-performance, flexible, reusable, and quick to iterate. Overly simple automated systems often cannot meet these requirements.
[0012] As mentioned above, existing data analysis systems follow two main technical approaches. When the goal is high performance and high accuracy, professionals usually have to invest substantial manpower in building a purpose-built system. Such a system is tied to its current use scenario, cannot be ported to others, and is expensive to maintain and upgrade. When the goal is simplicity and ease of use, the system usually inspects the data set automatically and selects the model training and analysis method on its own.
This approach only suits the rough analysis needs of ordinary users. It is unfriendly to research and development personnel, offers little operational freedom, and is unsuitable for enterprises and scientific research institutions that require high accuracy and high performance.
[0013] In addition, on a high-performance, highly versatile data analysis platform, machine learning training often requires a large amount of input data, and the intermediate data produced along the way also takes up a great deal of storage. Transmitting data efficiently is therefore another difficulty for such a system.
Transferring data directly between modules raises several problems:
1. Network bandwidth is limited. If several groups of online training run at the same time, large data transfers without effective scheduling will greatly reduce operating efficiency.
2. To transmit data directly, the modules need to know each other's communication addresses, which hinders distributed deployment.
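The abstract states that data generated by the modules is stored in the data storage module. One way to picture how that sidesteps the addressing problem is for modules to hand each other only a storage key. The following is a minimal sketch under that assumption, not the patented implementation: `DataStore`, `preprocess`, `extract_features`, and the sample rows are all hypothetical names and data introduced for illustration, and an in-memory dictionary stands in for whatever storage the real platform uses.

```python
import uuid
from typing import Any, Dict


class DataStore:
    """In-memory stand-in for a shared data storage module (hypothetical)."""

    def __init__(self) -> None:
        self._blobs: Dict[str, Any] = {}

    def put(self, value: Any) -> str:
        """Store a value and return the key other modules can use to fetch it."""
        key = uuid.uuid4().hex
        self._blobs[key] = value
        return key

    def get(self, key: str) -> Any:
        return self._blobs[key]


def preprocess(store: DataStore, raw_key: str) -> str:
    """Read raw rows by key, drop empty rows, normalize, store, return new key."""
    rows = store.get(raw_key)
    cleaned = [r.strip().lower() for r in rows if r.strip()]
    return store.put(cleaned)


def extract_features(store: DataStore, clean_key: str) -> str:
    """Read cleaned rows by key and store one token count per row."""
    rows = store.get(clean_key)
    return store.put([len(r.split()) for r in rows])


# Modules exchange only short keys; neither needs the other's network address.
store = DataStore()
raw_key = store.put(["  Power outage in Fuzhou ", "", "Grid upgrade finished"])
feature_key = extract_features(store, preprocess(store, raw_key))
features = store.get(feature_key)
print(features)
```

Because each stage takes and returns a key, stages can be reordered or deployed separately as long as they can reach the store. Scheduling of large transfers, the first problem listed above, is not modeled here.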



Examples


Embodiment 1

[0118] Manually label a small subset of W as either negative reports related to electric power companies in Fujian or other reports, and build the Fujian regional classification model E through the analysis platform.

[0119] However, because this category is defined so narrowly, the classification performance in actual experiments is poor, and the model is not reusable. To build a model for the Beijing area, negative reports related to power companies in the Beijing area would have to be screened all over again.

Embodiment 2

[0121] Manually label a small subset of W as Fujian or non-Fujian, and build the Fujian regional classification model A through the analysis platform (using an ensemble learning model based on decision trees).

[0122] Manually label a small subset of W as related or unrelated to electric power enterprises, and build the electric power enterprise classification model B through the analysis platform (using a classification model based on Ridge regression).

[0123] Manually label a small subset of W as negative or non-negative reports, and build the negative report classification model C through the analysis platform (using a neural network model).

[0124] Connect A, B, and C in series to build classification model D, which identifies negative reports related to electric power companies in Fujian.
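The series connection in [0124] can be sketched as follows. This is an illustration only: the simple keyword matchers stand in for the trained models A, B, and C (the decision-tree ensemble, Ridge-based classifier, and neural network of the embodiment are not reproduced), and the helper names `keyword_classifier` and `in_series`, the keyword lists, and the sample sentences are all hypothetical.

```python
from typing import Callable, List

Classifier = Callable[[str], bool]


def keyword_classifier(keywords: List[str]) -> Classifier:
    """Hypothetical stand-in for a trained model: True if any keyword occurs."""
    lowered = [k.lower() for k in keywords]

    def predict(text: str) -> bool:
        t = text.lower()
        return any(k in t for k in lowered)

    return predict


def in_series(*stages: Classifier) -> Classifier:
    """Chain classifiers: a report is accepted only if every stage accepts it."""

    def predict(text: str) -> bool:
        return all(stage(text) for stage in stages)

    return predict


# Stand-ins for models A (region), B (power enterprise), C (negative report).
model_a = keyword_classifier(["fujian", "fuzhou", "xiamen"])
model_b = keyword_classifier(["power", "grid", "electric"])
model_c = keyword_classifier(["outage", "complaint", "accident"])
model_d = in_series(model_a, model_b, model_c)

# Reuse: only the regional stage is swapped to target another area.
model_a_beijing = keyword_classifier(["beijing"])
model_d_beijing = in_series(model_a_beijing, model_b, model_c)

print(model_d("Power outage hits Fuzhou residents"))
print(model_d_beijing("Beijing electric company faces complaint"))
```

Under this reading, the contrast with Embodiment 1 is that B and C carry over unchanged when the region changes, so only one stage needs new labeled data.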

[0125] ...



Abstract

The invention provides a universal machine learning data analysis platform. The platform comprises an interface module, a data storage module, a preprocessing module, a feature extraction module, a feature conversion module, an algorithm module and a selection optimization module. The feature extraction module extracts feature parameters from the data to be analyzed according to the feature parameters set by the user. The feature conversion module converts the features set by the user into the representation form the user requires. The algorithm module comprises a plurality of algorithm models from which the user can select and build, and the user builds at least one group of models. The selection optimization module selects an optimal model and optimal parameters from the built models and then stores the optimal model. Data generated by the modules is stored in the data storage module. With this platform, the user can freely combine the modules and the algorithm models, build complex models, and quickly and iteratively develop new analytic models, which greatly improves working efficiency.
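As a rough illustration of what the selection optimization module does, the sketch below tries every combination of candidate model and parameter and keeps the best one. The excerpt does not say which selection criterion the platform uses, so accuracy on a held-out validation set is an assumption here, and the threshold models `above` and `below`, the function `select_best`, and the sample data are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[float, int]          # (feature value, label)
Model = Callable[[float], int]      # maps a feature value to a predicted label
ModelFactory = Callable[[float], Model]


def above(threshold: float) -> Model:
    """Toy model: predict 1 when the feature exceeds the threshold."""
    return lambda x: 1 if x > threshold else 0


def below(threshold: float) -> Model:
    """Toy model: predict 1 when the feature is under the threshold."""
    return lambda x: 1 if x < threshold else 0


def accuracy(model: Model, data: List[Sample]) -> float:
    return sum(model(x) == y for x, y in data) / len(data)


def select_best(
    candidates: Dict[str, ModelFactory],
    grid: Sequence[float],
    validation: List[Sample],
) -> Tuple[str, float, float]:
    """Return (model name, parameter, score) with the highest validation score."""
    best = None
    for name, factory in candidates.items():
        for param in grid:
            score = accuracy(factory(param), validation)
            # Strict comparison keeps the first candidate when scores tie.
            if best is None or score > best[2]:
                best = (name, param, score)
    if best is None:
        raise ValueError("no candidate models or parameters were supplied")
    return best


validation = [(0.1, 0), (0.4, 0), (0.6, 1), (0.9, 1)]
best = select_best({"above": above, "below": below}, [0.25, 0.5, 0.75], validation)
print(best)
```

An exhaustive search like this is only practical for small grids; it is shown because it makes the "optimal model and optimal parameter" idea concrete, not because the source specifies it.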

Description

[0001] Technical field

[0002] The invention relates to the computer field, in particular to a general-purpose machine learning data analysis platform.

[0003] Background

[0004] Today is undoubtedly an era of data explosion: countries, organizations and individuals alike are continuously producing data. Simply put, data analysis is the practice of collecting, processing, cleaning, and statistically computing data in order to discover useful information, knowledge, and insights that support decision-making. For example, the United Nations allocates humanitarian aid funds based on the GDP of each country, the Federal Reserve decides whether to raise interest rates based on the employment index, Fujian Province predicts typhoons based on weather data, and so on. People obtain data through various means and, by processing and analyzing it, guide future decisions. It can be said that data analysis tec...


Application Information

IPC(8): G06N99/00
CPC: G06N20/00
Inventor: 陈予言, 倪时龙, 苏江文, 王秋琳
Owner: FUJIAN YIRONG INFORMATION TECH