
Network table semantic recovery method

A web table semantic recovery technology, applied in special data processing applications, instruments, unstructured text data retrieval, etc. It addresses the inability of existing methods to obtain a uniquely determined column label and their low accuracy.

Inactive
Publication Date: 2015-07-22
BEIJING JIAOTONG UNIV
2 Cites, 13 Cited by

AI Technical Summary

Problems solved by technology

[0005] The prior-art method for restoring the semantics of a network table has the following disadvantages. Because the network table to be restored is relatively large and contains a huge number of tuples, processing steps such as parallel computing require a very large amount of calculation. The accuracy of the recovery results is not high: it is often impossible to obtain a unique column label for a given column of data, and entity column detection yields multiple possible results. The method is also not robust, and its accuracy is very low when dealing with numerical data.




Embodiment Construction

[0065] Embodiments of the present invention are described in detail below, examples of which are shown in the drawings, wherein the same or similar reference numerals denote the same or similar elements or elements having the same or similar functions throughout. The embodiments described below by referring to the figures are exemplary only for explaining the present invention and should not be construed as limiting the present invention.

[0066] Those skilled in the art will understand that, unless otherwise stated, the singular forms "a", "an", "said" and "the" used herein may also include plural forms. It should be further understood that the word "comprising" used in the description of the present invention refers to the presence of said features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understoo...



Abstract

The invention provides a semantic recovery method for network tables. The method comprises the following steps. Based on the Probase semantic database, preliminary semantic recovery is performed on the network table to be recovered, and a candidate concept set is obtained for each column of the table. Each initial clustering center of a clustering algorithm is determined according to the combination distances between different tuples in the table; the tuples are assigned to the clusters of the initial clustering centers, the clustering centers are adjusted, and a reduced network table is obtained from the final clustering centers. The column labels and column entities of all columns of the network table are then recovered from the candidate concept sets and the reduced table. By selecting the initial clustering centers and calculating similarity based on combination distances, the method improves the K-means clustering algorithm, effectively reduces the scale of the network table, lowers the complexity of the task, and improves the accuracy of recovering the column labels and column entities of the network table.
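The clustering step in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the combination distance or the rule for choosing initial centers, so everything here is an assumption made for illustration: `cell_distance` mixes a relative numeric difference with a string dissimilarity, `combined_distance` averages it over columns, `pick_initial_centers` uses a farthest-first rule, and `shrink_table` adjusts each center to the cluster medoid (a k-medoids style update rather than a true K-means mean, since cells may be text). All function names are hypothetical.

```python
from difflib import SequenceMatcher


def cell_distance(a, b):
    """Assumed per-cell distance in [0, 1]: numeric if both parse, else string-based."""
    try:
        x, y = float(a), float(b)
    except (TypeError, ValueError):
        return 1.0 - SequenceMatcher(None, str(a), str(b)).ratio()
    denom = max(abs(x), abs(y))
    return 0.0 if denom == 0 else min(1.0, abs(x - y) / denom)


def combined_distance(t1, t2):
    """Assumed tuple distance: mean cell distance over aligned columns."""
    return sum(cell_distance(a, b) for a, b in zip(t1, t2)) / len(t1)


def pick_initial_centers(rows, k):
    """Farthest-first choice: start at row 0, then add the row farthest from all centers."""
    centers = [0]
    while len(centers) < k:
        candidates = (i for i in range(len(rows)) if i not in centers)
        centers.append(max(
            candidates,
            key=lambda i: min(combined_distance(rows[i], rows[c]) for c in centers),
        ))
    return centers


def shrink_table(rows, k, max_iter=10):
    """Cluster the tuples and return one representative row per cluster."""
    if not rows:
        return []
    centers = pick_initial_centers(rows, min(k, len(rows)))
    for _ in range(max_iter):
        # Assign every tuple to the cluster of its nearest center.
        clusters = {c: [] for c in centers}
        for i, row in enumerate(rows):
            nearest = min(centers, key=lambda c: combined_distance(row, rows[c]))
            clusters[nearest].append(i)
        # Adjust each center to the medoid of its cluster.
        new_centers = []
        for c, members in clusters.items():
            members = members or [c]
            new_centers.append(min(
                members,
                key=lambda m: sum(combined_distance(rows[m], rows[o]) for o in members),
            ))
        if sorted(new_centers) == sorted(centers):
            break
        centers = new_centers
    return [rows[c] for c in sorted(centers)]


table = [
    ("Beijing", "China", "21540000"),
    ("Beijng", "China", "21540000"),
    ("Paris", "France", "2161000"),
    ("Paris", "France", "2160000"),
]
reduced = shrink_table(table, k=2)
print(reduced)
```

In this sketch the reduced table keeps one representative tuple per cluster, so later column-label and entity-column recovery would run on far fewer rows than the original table.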

Description

Technical field

[0001] The invention relates to the technical field of semantic restoration, in particular to a semantic restoration method for network tables.

Background art

[0002] The structural information in a table is of great value. The schema and entity columns of a table can be used to find related data tables and fuse them together. The table's schema information can also be exploited to explore binary relationships between different columns in the table. There is a large amount of tabular data on the Internet, but most of these web tables lack structural information such as table headers and entity columns, which makes it impossible to use this high-quality structured data in web page data retrieval and data fusion. To solve this problem, different types of semantic libraries have been introduced to assist in recovering the structural information of tables.

[0003] In the semantic database Freebase, the data is organized in a graph structure of node...


Application Information

IPC(8): G06F17/30
CPC: G06F16/35
Inventor: 王宁, 刘华西
Owner BEIJING JIAOTONG UNIV