Multi-source data deep fusion method based on deep learning

A deep learning technique for the deep fusion of multi-source data. It addresses the loss of data processing accuracy and the resulting decline in processing performance caused by dirty data, and improves tolerance to dirty data.

Active Publication Date: 2020-10-13
STATE GRID ZHEJIANG ELECTRIC POWER CO MARKETING SERVICE CENT +2
4 Cites, 7 Cited by

AI Technical Summary

Problems solved by technology

Dirty data greatly reduces the accuracy with which machines process data, which degrades processing performance.



Examples


Embodiment 1

[0040] Specifically, the embodiment of this application proposes a deep learning-based method for the deep fusion of multi-source data, as shown in Figure 1, including:

[0041] 11. Obtain the relational data tables to be fused, including the first data table and the second data table;

[0042] 12. Build a deep learning model, import training data into it, perform word vectorization on the content of the to-be-fused relational data tables, and perform pattern matching on the processed data;

[0043] 13. Based on the similarity between the entities corresponding to the data, perform stratified sampling on the data in the to-be-fused relational data tables, import the sampled data into a preset structural model for integration processing based on word vectors to obtain a trained data bucket model, and perform entity-based data bucketing based on that model;

[0044] 14. Determine whether the data in each bucket refers to the s...
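The stratified sampling in step 13 can be illustrated with a minimal sketch. The source does not say which similarity measure or how many strata are used, so everything here is an assumption: `difflib.SequenceMatcher` stands in for the entity similarity, and `stratified_sample`, `n_strata`, and `per_stratum` are hypothetical names. The idea is only that record pairs are binned by similarity and a few pairs are drawn from each bin, so that the training sample covers clear matches, clear non-matches, and borderline cases.

```python
import random
from difflib import SequenceMatcher
from itertools import product


def similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1]; the patent's actual measure is not given."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def stratified_sample(table_a, table_b, n_strata=4, per_stratum=2, seed=0):
    """Bin every cross-table pair by similarity, then sample from each bin."""
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    strata = [[] for _ in range(n_strata)]
    for a, b in product(table_a, table_b):
        s = similarity(a, b)
        # Map s in [0, 1] to a stratum index; s == 1.0 goes to the last stratum.
        idx = min(int(s * n_strata), n_strata - 1)
        strata[idx].append((a, b, s))
    sample = []
    for stratum in strata:
        k = min(per_stratum, len(stratum))
        sample.extend(rng.sample(stratum, k))
    return sample
```

A call such as `stratified_sample(["iPhone 11 Pro"], ["Apple iPhone 11 Pro", "Pixel 4"])` returns `(a, b, similarity)` triples that could then be labeled and used as training data for a bucketing model.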


Abstract

The embodiment of the invention provides a multi-source data deep fusion method based on deep learning. The method comprises the steps of: obtaining the relational data tables to be fused; constructing a deep learning model, importing training data to perform word vectorization on the contents of the to-be-fused relational data tables, and performing pattern matching on the processed data; performing stratified sampling on the data in the to-be-fused relational data tables based on the similarity between the entities corresponding to the data, importing the sampled data into a preset structural model for integration processing based on word vectors to obtain a trained data bucket model, and performing entity-based data bucketing based on the data bucket model; and judging whether the data in each bucket refers to the same entity, and fusing the data that refers to the same entity to obtain a data table formed by the fused data. According to the invention, string data is modeled as word vectors, which captures both the text and the semantics of a string and thereby improves tolerance to dirty data.
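The final step, fusing the records that refer to the same entity, can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the match decisions are taken as given input (`matched_pairs`), transitive matches are grouped with a small union-find, and each group is merged by taking the most common non-empty value per column. The source does not specify the merge rule, and `group_matches` and `fuse` are hypothetical names.

```python
from collections import Counter


def find(parent, i):
    """Return the root of i, compressing the path along the way."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i


def group_matches(n, matched_pairs):
    """Group record indices 0..n-1 so that matched records share a group."""
    parent = list(range(n))
    for i, j in matched_pairs:
        ri, rj = find(parent, i), find(parent, j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)  # smaller index becomes the root
    groups = {}
    for i in range(n):
        groups.setdefault(find(parent, i), []).append(i)
    return list(groups.values())


def fuse(records, group):
    """Merge one group of dict records: most common non-empty value per column."""
    columns = []
    for i in group:
        for col in records[i]:
            if col not in columns:
                columns.append(col)
    fused = {}
    for col in columns:
        values = [records[i][col] for i in group if records[i].get(col)]
        # Assumed rule: majority vote; None when every value is missing.
        fused[col] = Counter(values).most_common(1)[0][0] if values else None
    return fused
```

Applying `fuse` to each group returned by `group_matches` yields one row per entity, which corresponds to the fused data table described above.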

Description

Technical field

[0001] This application belongs to the field of data processing, and in particular relates to a method for deep fusion of multi-source data based on deep learning.

Background

[0002] Multi-source data deep fusion refers to the use of deep learning methods to fuse multi-source structured data so that data scientists can analyze it more easily. In this application, fusion refers to discovering the records in multi-source data that refer to the same real-world entity (each tuple in a table refers to an entity), also known as entity matching; one example is identifying different expressions of the same mobile phone. Entity matching is one of the important topics in the field of data science. Using deep learning, it is possible to make predictions on multi-source dirty data quickly and accurately, tap its value, and better address the 4V challenges of big data: Volume, Velocity, Variety, and Value.

[0003] Data in the real world is often dirty. For example, "Tsinghua University" may have multipl...
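To illustrate why vector representations tolerate dirty strings better than exact comparison, here is a minimal sketch. It uses character trigram counts and cosine similarity as a simple stand-in for the learned word vectors described in this application; the stand-in captures surface text only, not semantics. The variant spelling "Tsinghua Univ" and the contrasting name "Peking University" are hypothetical examples, not taken from the source.

```python
import math
from collections import Counter


def trigram_vector(text: str) -> Counter:
    """Sparse vector of character trigram counts (a stand-in for word vectors)."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity of two sparse count vectors; 0.0 if either is empty."""
    dot = sum(count * v[gram] for gram, count in u.items() if gram in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


clean = trigram_vector("Tsinghua University")
variant = trigram_vector("Tsinghua Univ")        # hypothetical dirty variant
other = trigram_vector("Peking University")      # hypothetical different entity

# Exact comparison treats the variant as a different string,
# while the vector similarity still ranks it above the other entity.
print("Tsinghua University" == "Tsinghua Univ")
print(cosine(clean, variant) > cosine(clean, other))
```

A trained embedding model would replace `trigram_vector` in practice; the comparison logic stays the same.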


Application Information

IPC(8): G06F16/2458, G06F16/28, G06K9/62
CPC: G06F16/2465, G06F16/285, G06F18/25
Inventor: 李国良, 柴成亮, 李熊, 李飞飞, 叶翔, 裘炜浩, 丁麒, 杨世旺, 金王英, 章晓明, 李舜
Owner: STATE GRID ZHEJIANG ELECTRIC POWER CO MARKETING SERVICE CENT