A method and device for acquiring an entry

An entry acquisition method and technology, applied in special data processing applications, instruments, electrical digital data processing, and similar fields. It addresses the insufficient collection of entity entries, enables more effective knowledge search, and improves structured data materials.

Active Publication Date: 2017-10-17
BEIJING BAIDU NETCOM SCI & TECH CO LTD

AI Technical Summary

Problems solved by technology

[0005] In view of this, the present invention provides a method and device for obtaining entries that use the existing thesaurus to mine entity entries and can guide users to create new words. This addresses the insufficient collection of entity entries in the encyclopedia database and facilitates more effective knowledge search.



Examples


Embodiment 1

[0068] Figure 1 is a flow chart of the method for obtaining entries provided by this embodiment. As shown in Figure 1, the method includes:

[0069] Step S101. Obtain a set of existing entries of the same category in the entry database.

[0070] The entry database may be an encyclopedia entry database, an input method entry database and other classified entry databases. In the present invention, the encyclopedia entry database is used as an example for illustration.

[0071] The classification can adopt the original categories of the classified entry database, including songs, movies, characters, nature, culture, geography, history, life, society, art, economy, science and technology, sports, and other categories. Alternatively, the existing entries can be divided into categories using existing classification or clustering methods (such as the Bayesian classification method, the decision tree method, or the support vector machine (SVM)).
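
The first option in paragraph [0071], reusing the categories the entry database already has, can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `group_by_category`, the (entry, category) pair layout, and the sample entries are all assumptions made for the example.

```python
from collections import defaultdict


def group_by_category(entries):
    """Group (entry, category) pairs into one set of entries per category.

    The pair layout is a hypothetical stand-in for however the entry
    database actually stores an entry's original category.
    """
    groups = defaultdict(set)
    for name, category in entries:
        groups[category].add(name)
    return dict(groups)


# Hypothetical sample data; real entries would come from the entry database.
entries = [
    ("Song A", "songs"),
    ("Song B", "songs"),
    ("Movie A", "movies"),
]
same_category_sets = group_by_category(entries)
```

Each value in `same_category_sets` is then one "set of existing entries of the same category" in the sense of step S101.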

[0072] Obtain the set of existin...

Embodiment 2

[0088] Figure 4 is a flow chart of the method for obtaining entries provided by this embodiment. As shown in Figure 4, the method includes:

[0089] Step S401. Obtain a set of existing entries of the same category in the entry database.

[0090] Step S402. Search using the acquired set of existing entries to obtain the anchor text containing the existing entries, and record the webpage position where the anchor text of the existing entries is located.

[0091] Step S403. According to the recorded webpage position, extract, at the corresponding position, the anchor text whose contextual distance from the anchor text of the existing entry satisfies the preset requirement.

[0092] The above steps S401 to S403 are correspondingly the same as the steps S101 to S103 in the first embodiment, and will not be repeated here.
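
Steps S402 and S403 can be illustrated with a small sketch that uses only the Python standard library. The excerpt does not say how contextual distance is measured, so the sketch assumes it is the number of positions between two anchors in document order. It also assumes an anchor matches an existing entry only when the texts are identical. The names `AnchorCollector`, `extract_nearby_anchors`, and `max_distance` are hypothetical.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the text of every <a> element in document order."""

    def __init__(self):
        super().__init__()
        self.anchors = []
        self._buffer = None  # None while the parser is outside an <a> element

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._buffer = []

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._buffer is not None:
            text = "".join(self._buffer).strip()
            if text:
                self.anchors.append(text)
            self._buffer = None


def extract_nearby_anchors(html, existing_entries, max_distance=2):
    """Return (positions of existing-entry anchors, nearby anchor texts)."""
    collector = AnchorCollector()
    collector.feed(html)
    collector.close()
    anchors = collector.anchors

    # S402: record where anchors naming an existing entry occur
    # (assumption: exact text match, position = index among the anchors).
    positions = [i for i, text in enumerate(anchors) if text in existing_entries]

    # S403: keep anchors within max_distance positions of a recorded one.
    nearby = []
    for pos in positions:
        start = max(0, pos - max_distance)
        stop = min(len(anchors), pos + max_distance + 1)
        for i in range(start, stop):
            if i != pos and anchors[i] not in nearby:
                nearby.append(anchors[i])
    return positions, nearby


# Hypothetical page fragment: a list of links that mixes known and unknown names.
page = (
    '<ul><li><a href="/a">Song A</a></li><li><a href="/x">Song X</a></li>'
    '<li><a href="/b">Song B</a></li><li><a href="/y">Song Y</a></li>'
    '<li><a href="/z">Far Away</a></li></ul>'
)
positions, nearby = extract_nearby_anchors(page, {"Song A", "Song B"}, max_distance=1)
```

The returned candidates can still include anchors that are themselves existing entries. The sketch leaves those in place because the comparison against the entry database happens in a later step.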

[0093] Step S404. Compare the extracted anchor text with the entry database to obtain the unrecorded anchor text.
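
One way to picture the comparison in step S404 is as a set difference between the extracted anchor texts and the entries already in the database. The sketch below is an illustration under assumptions: the function name `find_unrecorded` is hypothetical, and the whitespace trimming and case folding are choices made for the example, not rules stated in the source.

```python
def find_unrecorded(extracted_anchor_texts, entry_database):
    """Return extracted anchor texts that the entry database does not contain.

    Texts are compared after trimming whitespace and case folding, which is
    an assumed normalization. Order of first appearance is preserved.
    """
    known = {entry.strip().casefold() for entry in entry_database}
    seen = set()
    unrecorded = []
    for text in extracted_anchor_texts:
        key = text.strip().casefold()
        if key and key not in known and key not in seen:
            seen.add(key)
            unrecorded.append(text.strip())
    return unrecorded


# Hypothetical inputs: candidates from the extraction step and a tiny database.
candidates = ["Song X", "song a", "Song X ", "Song Y"]
unrecorded = find_unrecorded(candidates, {"Song A", "Song B"})
```

Preserving the order of first appearance keeps the output stable across runs, which a plain set difference would not guarantee.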

[0094] Since the...

Embodiment 3

[0125] Figure 5 is a schematic diagram of the device for acquiring entries provided in this embodiment. As shown in Figure 5, the device includes:

[0126] An existing entry obtaining module 501, configured to obtain a set of existing entries of the same category in the entry database.

[0127] The entry database may be an encyclopedia entry database, an input method entry database and other classified entry databases. In the present invention, the encyclopedia entry database is used as an example for illustration.

[0128] The classification can adopt the original categories of the classified entry database, including songs, movies, characters, nature, culture, geography, history, life, society, art, economy, science and technology, sports, and other categories. Alternatively, the existing entries can be divided into categories using existing classification or clustering methods (such as Bayesian classification method, decision tree method, support vector ...


Abstract

The present invention provides a method and device for obtaining an entry. The method includes: obtaining a set of existing entries of the same category in the entry database; searching using the obtained set of existing entries to obtain the anchor text containing the existing entries, and recording the webpage position where the anchor text of the existing entries is located; and, according to the recorded webpage position, extracting at the corresponding position the anchor text whose contextual distance from the anchor text of the existing entry meets preset requirements. The method and device provided by the present invention use the existing thesaurus to mine entity entries, can guide users to create new words, address the insufficient collection of entity entries in the encyclopedia database, and facilitate more effective knowledge search.

Description

【Technical field】

[0001] The invention relates to the technical field of Internet information processing, in particular to a method and device for acquiring entries.

【Background art】

[0002] With the continuous development of information and network technology, people increasingly search for knowledge and information through the Internet. An encyclopedia website, such as Baidu Encyclopedia, Wikipedia, or Interactive Encyclopedia, is a platform where all Internet users can browse, create, and improve content on an equal footing. It allows Internet users to find comprehensive, accurate, and objective definitional information, which other users can in turn query and browse for similar topics as corresponding knowledge or reference.

[0003] An entry is the basic division unit of the content contained in the encyclopedia website. An entry has one or more single themes, which are used to explain...


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/30
Inventor: 李永强
Owner: BEIJING BAIDU NETCOM SCI & TECH CO LTD