Data table automatic join recommendation method based on field semantics

A recommendation method for data tables, applied in digital data processing, natural language data processing, semantic analysis, etc. It addresses the problem that most algorithms do not take the data content itself into account, and it improves the level of intelligence of the analysis system.

Pending Publication Date: 2021-11-19
ZHEJIANG LAB

AI Technical Summary

Problems solved by technology

Building an association model based on semantic analysis of multidimensional data has some reference value, but most algorithms do not consider the data content itself, in particular the potential correlations hidden in the data content and its distribution under different data types.

Embodiment Construction

[0030] In order to make the purpose, technical solution and technical effect of the present invention clearer, the present invention will be further described in detail below in conjunction with the accompanying drawings.

[0031] As shown in Figure 1, the automatic join recommendation method for data tables based on field semantics of the present invention comprises the following steps:

[0032] Step 1. When two tables are connected to a join node, automatic join recommendation is triggered. 5000 non-empty records are selected from each table's database, and the fields of the two data tables to be joined are combined in pairs to form the set over which similarity values are calculated;

[0033] Step 2. First infer the semantic type of each field. There are 13 types, including latitude and longitude, country, province, city, zip code, IP address, URL, email, telephone, ID card, passport, category and empty; among them, latitude and longitude, IP address, Regular ma...
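Steps 1 and 2 can be illustrated with a minimal Python sketch. The sample size of 5000 and the pairwise combination of fields come from the text above. Everything else is an assumption made for illustration: the function names, the three simplified regular expressions, the 0.9 match threshold, the "unknown" fallback, and taking the first 5000 non-empty values rather than some other selection. The source does not give its actual expressions or selection rule.

```python
import re
from itertools import product

SAMPLE_SIZE = 5000  # number of non-empty records per table, from Step 1

# Hypothetical, simplified patterns for three of the listed semantic types.
# The source does not state the regular expressions it uses.
PATTERNS = {
    "ip_address": re.compile(
        r"^(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)$"
    ),
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "url": re.compile(r"^https?://\S+$", re.IGNORECASE),
}


def sample_non_empty(values, limit=SAMPLE_SIZE):
    """Return up to `limit` values that are neither None nor blank."""
    out = []
    for v in values:
        if v is not None and str(v).strip() != "":
            out.append(v)
            if len(out) == limit:
                break
    return out


def candidate_pairs(left_fields, right_fields):
    """Combine the fields of the two tables in pairs (Step 1)."""
    return list(product(left_fields, right_fields))


def infer_semantic_type(values, threshold=0.9):
    """Guess a semantic type by regex matching (part of Step 2).

    `threshold` is an assumed share of sampled values that must match.
    """
    sample = sample_non_empty(values)
    if not sample:
        return "empty"
    for name, pattern in PATTERNS.items():
        hits = sum(1 for v in sample if pattern.match(str(v).strip()))
        if hits / len(sample) >= threshold:
            return name
    return "unknown"  # assumed fallback; not one of the source's 13 types
```

For example, `candidate_pairs(["id", "email"], ["uid"])` yields the two pairs `("id", "uid")` and `("email", "uid")`, each of which would then be scored in the later steps.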

Abstract

The invention relates to the field of multidimensional data analysis, in particular to an automatic join recommendation method for data tables based on field semantics. The method comprises the following steps: 1, combining the fields of the two data tables to be joined in pairs to form the set over which similarity values are calculated; 2, inferring the semantic type of each field; 3, judging whether the data types and the semantic types of the two fields are consistent, then judging whether the names of the two fields are consistent, and then judging whether the values of the two fields contain enumeration classes; 4, calculating the similarity of the field names and the similarity of the field values separately, and then obtaining a matching coefficient, namely the similarity of the two fields, through weighted summation; and 5, sorting the similarity scores of all field pairs from high to low and outputting the first 20 items as recommendations. By analyzing the field names and field values of the data tables, the method recommends join clauses, helping users find the associations hidden in multidimensional data more accurately and comprehensively and effectively improving the level of intelligence of a big data analysis system.
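Steps 4 and 5 of the abstract can be sketched in Python as follows. Only the weighted summation and the top-20 cutoff come from the text. The similarity measures (a `difflib` ratio for names, Jaccard overlap for values), the equal weights of 0.5, and all function names are assumptions chosen for illustration, since the abstract does not specify them. The consistency checks of step 3 are omitted.

```python
from difflib import SequenceMatcher
from itertools import product

TOP_N = 20  # number of recommendations to output, from step 5


def name_similarity(a, b):
    """Assumed measure: case-insensitive difflib ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def value_similarity(xs, ys):
    """Assumed measure: Jaccard overlap of the distinct sampled values."""
    sx, sy = set(xs), set(ys)
    union = sx | sy
    return len(sx & sy) / len(union) if union else 0.0


def recommend_joins(left, right, w_name=0.5, w_value=0.5, top_n=TOP_N):
    """Score every field pair and return the best `top_n`.

    `left` and `right` map field name -> list of sampled values.
    The weights are illustrative; the source does not state them.
    """
    scored = []
    for (ln, lv), (rn, rv) in product(left.items(), right.items()):
        # Matching coefficient: weighted sum of the two similarities (step 4).
        score = w_name * name_similarity(ln, rn) + w_value * value_similarity(lv, rv)
        scored.append((ln, rn, score))
    # Sort from high to low and keep the first top_n pairs (step 5).
    scored.sort(key=lambda t: t[2], reverse=True)
    return scored[:top_n]
```

Each returned tuple is `(left_field, right_field, score)`, so the first entry is the highest-scoring candidate for the join clause.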

Description

Technical field

[0001] The invention relates to the field of multidimensional data analysis, in particular to an automatic join recommendation method for data tables based on field semantics.

Background

[0002] In a multidimensional data analysis system, joining two tables is a frequent operation. Recommending join clauses by analyzing the field names and values of the data tables helps users complete the join operation and raises the system's level of intelligence.

[0003] Multidimensional data association has become a common operation and a basic technique in the field of big data analysis. Effectively integrating multidimensional data that come from different sources and organizations, follow different design specifications, and may even lack data dictionaries, and establishing a unified data model over them, is very important for today's data analysis tasks. Although the method of manually screening and matching the d...

Application Information

IPC(8): G06F40/30, G06F40/289, G06K9/62, G06N3/04, G06N3/08
CPC: G06F40/30, G06F40/289, G06N3/08, G06N3/045, G06F18/22, G06F18/2415
Inventor: 罗实, 李炜铭, 王永恒
Owner: ZHEJIANG LAB