
Entity linkage algorithm based on graph model

The invention applies graph-model and entity-linking techniques to computing and electronic digital data processing. It addresses three problems of prior approaches: the semantic information of the target entity is ignored, manual labeling costs manpower and time, and semantic information is not mined.

Status: Inactive
Publication Date: 2015-11-11
EAST CHINA NORMAL UNIV
Cites: 2, Cited by: 26

AI Technical Summary

Problems solved by technology

The disadvantage of this method is that it requires manual labeling of the data set, which costs considerable manpower and time. It also does not mine the semantic information in the article that describes the target entity; instead it treats all entities appearing in the article equally, so the role of the target entity's semantic information is ignored.
The advantage of the unsupervised learning method is that it does not need labeled data, which saves considerable manpower and time. Its disadvantage is that the features are not easy to integrate.



Examples


Embodiment Construction

[0042] Referring to attached figure 1, the present invention uses the Wikipedia knowledge base to form candidate entities, uses LDA to construct semantic features between entities, and uses Wikipedia's link structure to build a graph model of the relationships between entities. The related semantic features are integrated into the graph model, and the PageRank algorithm then ranks the entities to obtain the entity linking result. The entity linking algorithm includes the following specific steps:

[0043] (1) Naming dictionary

[0044] Use the JWPL tool to convert the irregular data downloaded from Wikipedia into regular data and import it into an offline Wikipedia database. From this database, obtain the entity pages, redirect pages, disambiguation pages, and hyperlinks in Wikipedia, combine these different types of features into the different names of each entity, and map the entities by hash to build an offline dictionary.

[0045] Wikipedia provides a series...
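The naming dictionary of step (1) can be pictured as a hash map from surface names to the entities they may refer to. The following is a minimal sketch under stated assumptions, not the patent's implementation: the function names `normalize` and `build_name_dictionary`, the input shapes, and the toy data are all hypothetical, and the inputs are assumed to have already been extracted from the offline database (the JWPL extraction itself is not shown).

```python
from collections import defaultdict


def normalize(name):
    """Hypothetical normalization: trim whitespace and lowercase the name."""
    return name.strip().lower()


def build_name_dictionary(titles, redirects, disambiguations, anchors):
    """Map each normalized surface name to the set of entity titles it may denote.

    titles          -- iterable of entity page titles
    redirects       -- dict: redirect page title -> target entity title
    disambiguations -- dict: ambiguous name -> list of entity titles
    anchors         -- iterable of (hyperlink anchor text, target entity title)
    """
    dictionary = defaultdict(set)
    for title in titles:
        dictionary[normalize(title)].add(title)
    for alias, target in redirects.items():
        dictionary[normalize(alias)].add(target)
    for name, targets in disambiguations.items():
        dictionary[normalize(name)].update(targets)
    for anchor_text, target in anchors:
        dictionary[normalize(anchor_text)].add(target)
    return dict(dictionary)


# Toy data (invented for illustration only).
name_dict = build_name_dictionary(
    titles=["Michael Jordan", "Michael I. Jordan"],
    redirects={"Air Jordan (player)": "Michael Jordan"},
    disambiguations={"Jordan": ["Michael Jordan", "Michael I. Jordan"]},
    anchors=[("MJ", "Michael Jordan")],
)
candidates = name_dict.get(normalize("Jordan"), set())
```

A Python `dict` is itself a hash table, so looking up a query name yields its candidate entity set in expected constant time, which is the role the offline dictionary plays for the later steps.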


Abstract

The present invention discloses an entity linkage algorithm based on a graph model. The algorithm forms candidate entities by using the Wikipedia knowledge base, constructs semantic features between the entities by using LDA, builds a graph model of the relationships between entities from the link structure of Wikipedia, integrates the related semantic features into the graph model, and ranks the entities by using the PageRank algorithm to obtain the entity linkage result. The specific steps are: building a naming dictionary, forming a candidate entity set, calculating and integrating related features, constructing the graph model, and ranking the candidate entities. Compared with the prior art, the algorithm integrates entity features well and gives a highly reliable entity linkage result. The data is downloaded from Wikipedia, so no additional cost is incurred and, in particular, data sets do not need to be labeled manually. The method is simple, convenient to use, and saves time and effort.
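The ranking stage summarized above can be illustrated with a small weighted PageRank. This is a sketch under assumptions, not the patent's formulation: the excerpt does not say how the LDA features enter the graph, so here each link is weighted by the cosine similarity of two topic vectors. The names `cosine` and `rank_candidates`, the damping factor of 0.85, and the toy topic vectors are all hypothetical, and every node that appears in a link is assumed to have a topic vector.

```python
import math


def cosine(p, q):
    """Cosine similarity between two equal-length topic vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def rank_candidates(links, topics, damping=0.85, iterations=50):
    """Weighted PageRank over an entity graph.

    links  -- iterable of directed (source, target) entity links
    topics -- dict: entity -> topic distribution (assumed to come from LDA)
    """
    nodes = sorted(topics)
    n = len(nodes)
    # Edge weight = topic similarity of the two linked entities (an assumption).
    weights = {u: {} for u in nodes}
    for u, v in links:
        w = cosine(topics[u], topics[v])
        if w > 0:
            weights[u][v] = w
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_scores = {node: (1.0 - damping) / n for node in nodes}
        for u in nodes:
            total = sum(weights[u].values())
            if total == 0:
                # Dangling node: spread its score evenly over all nodes.
                for node in nodes:
                    new_scores[node] += damping * scores[u] / n
            else:
                for v, w in weights[u].items():
                    new_scores[v] += damping * scores[u] * w / total
        scores = new_scores
    return scores


# Toy graph (invented): two candidates for one mention plus two context entities.
topics = {
    "Michael Jordan": [0.9, 0.1],
    "Michael I. Jordan": [0.1, 0.9],
    "Chicago Bulls": [0.8, 0.2],
    "NBA": [0.85, 0.15],
}
links = [
    ("Chicago Bulls", "Michael Jordan"),
    ("NBA", "Michael Jordan"),
    ("Michael Jordan", "Chicago Bulls"),
    ("NBA", "Chicago Bulls"),
]
scores = rank_candidates(links, topics)
best = max(["Michael Jordan", "Michael I. Jordan"], key=scores.get)
```

In this toy graph the candidate that is linked from the context entities accumulates more score than the isolated one, which is the behavior the ranking step relies on to pick a linking result.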

Description

Technical field

[0001] The invention relates to the technical field of information base text processing, in particular to an entity linking algorithm based on a graph model.

Background technique

[0002] The entity linking task studies entity nouns of three types: people, institutions, and places. The research goal is as follows: given a query that contains the target entity and the background documents that support the query word, correctly link the target entity to the entity in the existing knowledge base that refers to the same thing. If the knowledge base contains no entity node that matches the query entity, the query entity is called a non-KB entity, and such non-KB entities are clustered. Entities needed by common queries are then added to the knowledge base to expand and maintain it. Therefore, on the one hand, the entity linking task can accurately feed back the user's query resu...


Application Information

IPC(8): G06F17/30
CPC: G06F16/374
Inventors: 杨燕, 罗念, 贺樑
Owner: EAST CHINA NORMAL UNIV