A Book Semantic Retrieval Method Based on Content Structure

A content-structure-based book retrieval technique, applied in the field of semantic association retrieval of book content. It addresses unsatisfactory recall and precision, inconvenience to users, and the inadequate expression of book content by titles and subject terms alone, improving on mechanical keyword matching and increasing the recall rate.

Active Publication Date: 2018-11-16
杭州淘艺数据技术有限公司

AI Technical Summary

Problems solved by technology

[0003] At present, users of digital libraries and of book sales sites such as Dangdang.com and Amazon mainly search for books with keyword queries. The user's query intention is not reasonably understood, and the content of a book cannot be fully expressed through its title and keywords. Users often have to screen a large number of results manually to find their target, and may even need to run a second search, which is very inconvenient; recall and precision are also unsatisfactory.



Examples


Detailed Description of the Embodiments

[0014] In order to make the specific features and advantages of the present invention more comprehensible, the present invention will be further described in detail below in conjunction with the accompanying drawings and specific embodiments.

[0015] Figure 1 is a flowchart of the method of the present invention. As shown in Figure 1, the present invention comprises the following steps:

[0016] Step (1): The user enters a search sentence, which is preprocessed with the domain dictionary to obtain several keywords. These keywords are then expanded with synonyms from the domain ontology to obtain the user's initial query keyword set T1. For example, when the user enters "Chinese word segmentation technology", the terms "Chinese word segmentation" and "word segmentation technology" are added as expanded words.

[0017] Step (2): Map the query keyword set T1 onto the domain ontology, and calculate the semantic correlation between the mapped concept and other concepts according to the s...
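
Step (1) can be illustrated with a minimal Python sketch. It is not the patented implementation: the domain dictionary, stop-word list, and synonym table below are small hypothetical stand-ins for the domain dictionary and domain ontology, the function names `extract_keywords` and `expand_query` are invented for illustration, and English whitespace tokenization with greedy longest-match lookup stands in for Chinese word segmentation.

```python
# Hypothetical stand-ins for the domain dictionary and the synonym
# relations of the domain ontology; none of these come from the source.
DOMAIN_TERMS = {
    "chinese word segmentation technology",
    "chinese word segmentation",
    "word segmentation technology",
    "semantic retrieval",
}
STOP_WORDS = {"a", "an", "the", "of", "on", "about", "books"}
SYNONYMS = {
    "chinese word segmentation technology": {
        "chinese word segmentation",
        "word segmentation technology",
    },
}


def extract_keywords(query, terms=DOMAIN_TERMS, stop_words=STOP_WORDS):
    """Greedy longest-match lookup of domain terms in a tokenized query."""
    tokens = [t for t in query.lower().split() if t not in stop_words]
    keywords = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for j in range(len(tokens), i, -1):
            phrase = " ".join(tokens[i:j])
            if phrase in terms:
                keywords.append(phrase)
                i = j
                break
        else:
            i += 1  # no domain term starts here; skip this token
    return keywords


def expand_query(query, synonyms=SYNONYMS):
    """Build the initial query keyword set T1: keywords plus their synonyms."""
    t1 = set()
    for keyword in extract_keywords(query):
        t1.add(keyword)
        t1 |= synonyms.get(keyword, set())
    return t1


t1 = expand_query("books about Chinese word segmentation technology")
```

With these toy tables, `t1` contains the original term together with "chinese word segmentation" and "word segmentation technology", mirroring the example in paragraph [0016]. A real system would replace the lookup tables with a Chinese segmenter and an ontology query.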


Abstract

The invention provides a semantic association retrieval method based on book content structure. The title, table of contents, and abstract of a book contribute to its main content to different degrees, and the chapter titles and section titles in the table of contents differ in how strongly they reflect that content. Accordingly, the book content structures (the title, the table-of-contents structure, and the abstract structure) undergo Chinese word segmentation and stop-word removal using a domain dictionary, followed by synonym expansion using a domain ontology, to obtain a group of keywords. The keywords are given different weights according to the structure in which they occur, and the structure-weighted book content is stored in a vector space model. The query words entered by a user are semantically expanded through the domain ontology, and the similarity between the user retrieval intention vector and the book content structure vector is calculated to identify more accurately the books most closely associated with the query. The method increases recall and precision and improves on the mechanical keyword matching used for book retrieval in the prior art.
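
The structure-weighted vector space model described in the abstract can be sketched roughly as follows. The sketch rests on several assumptions that the source does not confirm: the weight values in `STRUCTURE_WEIGHTS` are made up (the source says only that the weights differ by structure), cosine similarity is used because the source does not name a similarity measure, and the function names `book_vector`, `cosine`, and `rank_books` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical structure weights; the source states only that keywords in
# different structures receive different weights, not what the values are.
STRUCTURE_WEIGHTS = {"title": 0.4, "chapter": 0.25, "section": 0.15, "abstract": 0.2}


def book_vector(keywords_by_structure, weights=STRUCTURE_WEIGHTS):
    """Sum structure weights per keyword to get a sparse book vector."""
    vec = Counter()
    for structure, keywords in keywords_by_structure.items():
        for keyword in keywords:
            vec[keyword] += weights[structure]
    return dict(vec)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_books(query_vec, books):
    """Return book names ordered from most to least similar to the query."""
    scored = [(cosine(query_vec, book_vector(b)), name) for name, b in books.items()]
    return [name for _, name in sorted(scored, reverse=True)]


books = {
    "A": {"title": ["word segmentation"], "abstract": ["word segmentation"]},
    "B": {"title": ["databases"], "section": ["word segmentation"]},
}
ranking = rank_books({"word segmentation": 1.0}, books)
```

In this toy data, book A mentions the query term in its title and abstract, while book B mentions it only in a section title, so A ranks ahead of B. This is the kind of distinction that plain keyword matching, which would treat both books as equal hits, cannot make.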

Description

Technical field

[0001] The invention relates to the field of digital books, in particular to a method for performing semantic association retrieval on book contents.

Background

[0002] The core competitiveness of a digital library is the accurate retrieval of digital books, and the core of accurate retrieval is an accurate understanding of both the contents of books and the user's retrieval intentions. At present, research on semantic retrieval of book contents lags far behind actual needs.

[0003] At present, users of digital libraries and of book sales sites such as Dangdang.com and Amazon mainly search for books with keyword queries. The user's query intention is not reasonably understood, and the content of a book cannot be fully expressed through its title and keywords. Users often have to screen a large number of results manually to find their target, and may even need to run a second search, which is very inconvenient, and the recall ...


Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8): G06F17/30
CPC: G06F16/951
Inventor: 王强宁吴夏
Owner: 杭州淘艺数据技术有限公司