Web Table Entity Expansion Method

An entity expansion method for web tables, applied in the field of structured data integration on web pages, that addresses problems such as entity inconsistency, low accuracy, and missing column labels.

Active Publication Date: 2020-04-21
BEIJING JIAOTONG UNIV

AI Technical Summary

Problems solved by technology

The existing strategy of splitting a table into entity-attribute binary relations presupposes that the attributes in a web table are unrelated to each other. This assumption ignores the connections between attribute columns, so the table semantics are split apart, which leads to low entity expansion accuracy and inconsistent entities.
Most web tables are n-ary tables. Splitting them with existing techniques destroys the table semantics, so the entities and attributes that are stitched back together are inconsistent.
Web tables are not standardized: column labels are often missing, so the matching relationship between tables cannot be judged from column labels.
Entities are ambiguous: entities with the same name may have different semantics. Relying only on entities to judge the matching relationship between tables leads to semantic conflicts between matched tables.



Embodiment

[0094] This embodiment of the invention analyzes and defines the problem and then derives the theorems used to solve it, as detailed below:

[0095] 1 Problem Definition

[0096] In recent years, entity expansion has attracted growing attention from researchers. The InfoGather system proposed by Mohamed Yakout et al. expands entities through indirect matching, and Oliver Lehmberg et al. proposed the SearchJoin search engine to expand query tables. During entity expansion, these methods all treat a web table as an entity-attribute binary table, in which each table has only one attribute column to be expanded.

[0097] In reality, most web tables are n-ary tables. When a web table is divided into multiple binary tables for processing, the semantics of the table are split, which results in inconsistent entities in the result table. In order to ensure the consistency of entities in the result table...
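The inconsistency described above can be illustrated with a small Python sketch. The two tables, their column names, and all values are made up for illustration and do not come from the patent; the point is only that expanding each attribute column independently can stitch together a row that no source table contains.

```python
# Two hypothetical web tables that both contain an entity named "Springfield"
# but refer to different cities (all values are invented for illustration).
table_a = [{"city": "Springfield", "state": "Illinois", "population": 114000}]
table_b = [{"city": "Springfield", "state": "Missouri", "population": 169000}]


def split_binary(table, key, attr):
    """Project an n-ary table onto one entity-attribute binary relation."""
    return {row[key]: row[attr] for row in table}


# Expand each attribute column on its own, as the binary-table approach does.
# Here "state" happens to be answered from table_a and "population" from table_b.
state_of = split_binary(table_a, "city", "state")
population_of = split_binary(table_b, "city", "population")

# Stitch the separately expanded columns back together by entity name.
stitched = {
    city: {"state": state_of[city], "population": population_of[city]}
    for city in state_of.keys() & population_of.keys()
}

print(stitched)
```

The stitched row pairs the state from one table with the population from the other, so it matches neither source row. This is the kind of entity inconsistency that keeping the n-ary table intact is meant to avoid.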


Abstract

The invention provides an entity expansion method for web tables. The method comprises: selecting a seed table for a seed group by calculating the semantic matching value between each web table and the query table; repeatedly selecting the table with the largest table potential and adding it to the seed group, where the added table must satisfy the consistency matching relation with the seed group and is used to raise the coverage rate of the seed group; when the coverage rate reaches a set threshold, treating the seed group as a consistency group that meets the given coverage rate; and treating the nodes of the consistency group as the answer tables required for entity expansion, then using the answer tables to build the final result table with consistent expanded entities. By introducing the concept of a consistency matching relation into the search for answers, the method improves the consistency of the answer tables. It can therefore handle the expansion of multiple query columns, which guarantees consistent results while also ensuring high precision and reliability.
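The procedure in the abstract can be sketched as a greedy loop. This is a minimal illustration under stated assumptions, not the patented method: the abstract does not define the semantic matching value, the table potential, or the consistency matching relation, so the functions `match_score`, `potential`, and `consistent` below are hypothetical stand-ins (entity overlap, coverage gain, and agreement on shared entities, respectively), and each table is modeled as a dict from entity name to a tuple of attribute values.

```python
def match_score(table, query):
    """Stand-in for the semantic matching value: share of query entities in the table."""
    return len(set(table) & set(query)) / len(query)


def consistent(t1, t2):
    """Stand-in for the consistency matching relation: agree on every shared entity."""
    return all(t1[e] == t2[e] for e in set(t1) & set(t2))


def coverage(group, query):
    """Share of query entities covered by at least one table in the group."""
    covered = set().union(*(set(t) for t in group)) & set(query)
    return len(covered) / len(query)


def expand(query, tables, threshold=0.8):
    # Seed the group with the table that best matches the query.
    seed = max(tables, key=lambda t: match_score(t, query))
    group = [seed]
    remaining = [t for t in tables if t is not seed]

    while coverage(group, query) < threshold:
        def potential(t):
            # Stand-in for table potential: coverage gained by adding t.
            return coverage(group + [t], query) - coverage(group, query)

        # Only tables consistent with every group member, and that add coverage.
        candidates = [
            t for t in remaining
            if all(consistent(t, g) for g in group) and potential(t) > 0
        ]
        if not candidates:
            break  # threshold cannot be reached with consistent tables
        best = max(candidates, key=potential)
        group.append(best)
        remaining = [t for t in remaining if t is not best]

    # Build the result table from the consistent group.
    answer = {}
    for table in group:
        for entity, attrs in table.items():
            if entity in query:
                answer.setdefault(entity, attrs)
    return answer


query = ["A", "B", "C", "D"]
t1 = {"A": (1,), "B": (2,), "C": (3,)}
t2 = {"C": (3,), "D": (4,)}
t3 = {"A": (99,), "D": (5,)}  # conflicts with t1 on entity "A"
result = expand(query, [t1, t2, t3], threshold=1.0)
print(result)
```

In this toy run `t1` becomes the seed, `t3` is rejected because it disagrees with `t1` on entity `A`, and `t2` is added to reach full coverage, so every row in the result comes from mutually consistent tables.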

Description

Technical field

[0001] The invention relates to the technical field of structured data integration on web pages, and in particular to an entity expansion method for web tables.

Background

[0002] Users usually want to obtain the information they are interested in. They can use the large number of tables on the web as information sources and obtain that information through entity expansion. Existing techniques assume that web tables are entity-attribute binary relations. For tables with multiple attribute columns to be expanded, the existing techniques first split the tables into several entity-attribute binary relations, and then aggregate the separately expanded results into a complete answer. As a consequence, the semantics of the table are split during the splitting process, and the result table assembled from the split binary relations inevitably suffers from entity inconsistency and low accuracy.

[0003] The InfoGather system proposed by M...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/28, G06F16/2458
CPC: G06F16/2458, G06F16/288
Inventor: 王宁, 孙伟娟
Owner: BEIJING JIAOTONG UNIV