
Full-text database accurate and efficient retrieval method for perfecting subject terms

A subject-term and database technology, applied in the field of full-text database retrieval, that addresses three problems: expanded term sets containing classical Chinese words that are rarely used today, increased workload in the information retrieval and result processing stages, and reduced system retrieval efficiency.

Pending Publication Date: 2020-10-27
刘秀萍

AI Technical Summary

Problems solved by technology

[0009] Third, the extended version of Cilin used in the prior art is too comprehensive: a single subject term can expand into a set of more than a dozen related words. Retrieving with the whole set increases the workload of the information retrieval stage and the result processing stage and reduces the retrieval efficiency of the system.
Moreover, classical Chinese words are rarely used in modern Chinese; some have low semantic correlation with modern usage, and some are even ambiguous.
Such noise in the related-word set interferes with the search results and reduces retrieval accuracy.
[0010] Fourth, the keyword set produced by prior-art optimization contains many related words. These words differ from the original keyword to varying degrees, so their influence on the search results should differ as well. Lucene's default sorting algorithm, however, treats all keywords identically, and it is unreasonable not to distinguish how closely each one correlates with the subject term.
As a result, the ordering of search results does not match users' search habits, and documents highly relevant to the subject terms are not ranked at the top of the results.
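
The ranking problem described in [0010] can be illustrated with a small sketch. It is not Lucene's scoring formula and not the patent's improved sorting algorithm; it is a simplified term-frequency score in which each query term carries a weight. The function names, the sample documents, and the weight values are all assumptions made for illustration.

```python
from collections import Counter


def score(doc_tokens, weighted_terms):
    """Sum of (term frequency x term weight) over the query terms."""
    counts = Counter(doc_tokens)
    return sum(weight * counts[term] for term, weight in weighted_terms.items())


def rank(docs, weighted_terms):
    """Return document ids sorted by descending score."""
    return sorted(
        docs,
        key=lambda doc_id: score(docs[doc_id], weighted_terms),
        reverse=True,
    )


# Hypothetical corpus: d1 uses the original term, d2 only a related word.
docs = {
    "d1": ["computer", "retrieval", "computer"],
    "d2": ["calculator", "calculator", "calculator"],
}

# Treating every term equally lets the related word dominate.
uniform = {"computer": 1.0, "calculator": 1.0}
# Down-weighting the related word (hypothetical weight) restores d1 to the top.
weighted = {"computer": 1.0, "calculator": 0.5}

print(rank(docs, uniform))   # ['d2', 'd1']
print(rank(docs, weighted))  # ['d1', 'd2']
```

With uniform weights the document that only contains the related word ranks first; once the related word is down-weighted, the document containing the original subject term moves to the top.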




Embodiment Construction

[0071] The technical solution of the accurate and efficient full-text database retrieval method for perfecting subject terms provided by the present invention is further described below with reference to the accompanying drawings, so that those skilled in the art can better understand and implement the invention.

[0072] The present invention proposes an accurate and efficient full-text database retrieval method for perfecting keywords, which mainly comprises three stages: long-tail keyword selection, keyword expansion, and keyword simplification.

[0073] In the long-tail keyword selection stage, to address the problem that a user enters a single search term or a short search sentence and the retrieval intention is unclear, the present invention creates a long-tail keyword thesaurus based on domain knowledge and user search logs, and designs a long-tail keyword selection algorithm that recommends that users optimize their search sentence...
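
A minimal sketch of the long-tail selection idea in [0073] follows. The source text is truncated before the algorithm is described, so this is not the patent's algorithm: it assumes the thesaurus is a plain mapping from long-tail phrases to how often each appears in the search logs, and it suggests phrases that contain the user's short query. The function name `suggest_long_tail`, the data structure, and the sample counts are hypothetical.

```python
def suggest_long_tail(query, thesaurus, limit=3):
    """Suggest long-tail phrases that contain the short query.

    thesaurus: dict mapping long-tail phrase -> count in search logs
    (an assumed structure, not taken from the patent).
    """
    query = query.strip().lower()
    if not query:
        return []
    candidates = [
        (phrase, count)
        for phrase, count in thesaurus.items()
        if query in phrase.lower() and phrase.lower() != query
    ]
    # Most frequently searched phrases first; ties broken alphabetically.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return [phrase for phrase, _ in candidates[:limit]]


# Hypothetical thesaurus built from domain knowledge and search logs.
thesaurus = {
    "full-text database retrieval": 40,
    "database index compression": 25,
    "relational database design": 25,
    "image retrieval": 10,
}

print(suggest_long_tail("database", thesaurus, limit=2))
# ['full-text database retrieval', 'database index compression']
```

Substring matching is the simplest possible choice here; a real system would more likely match on segmented words, which matters for Chinese text in particular.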



Abstract

The invention provides an accurate and efficient full-text database retrieval method for perfecting subject terms. The method comprises the following steps: first, creating a long-tail subject term lexicon and designing a long-tail subject term selection algorithm that recommends that the user optimize the search expression, thereby clarifying the search intention; second, designing a subject term synonym expansion algorithm based on the extended version of Cilin, which expands the subject term into a set of its synonyms; third, designing an improved synonym similarity calculation method based on the Known synonym tree, and using it to calculate the similarity between each term in the expanded set and the original subject term; fourth, setting an appropriate similarity threshold through statistical analysis of multiple groups of experiments, and simplifying the subject term set according to that threshold to obtain an improved subject term set; and finally, improving the retrieval result sorting algorithm. The method greatly improves recall, precision, and retrieval efficiency, and shows good performance.
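
The threshold-based simplification step in the abstract can be sketched as follows. The patent's improved similarity calculation is not reproduced here; the sketch assumes the similarity of each expanded term to the original subject term has already been computed. The function name `simplify_terms`, the sample similarity values, and the threshold of 0.8 are placeholders, since the source only says the threshold is set through experimental statistical analysis.

```python
def simplify_terms(original, similarities, threshold):
    """Keep the original term plus expanded terms at or above the threshold.

    similarities: dict mapping expanded term -> similarity to `original`
    (values assumed to be precomputed and in the range 0..1).
    Returns a dict of kept term -> similarity, original term first.
    """
    kept = {original: 1.0}
    for term, sim in similarities.items():
        if term != original and sim >= threshold:
            kept[term] = sim
    return kept


# Hypothetical expanded set for the subject term "retrieval".
similarities = {"search": 0.92, "lookup": 0.81, "seek": 0.45}

improved = simplify_terms("retrieval", similarities, threshold=0.8)
print(improved)
# {'retrieval': 1.0, 'search': 0.92, 'lookup': 0.81}
```

Returning the similarity alongside each kept term means the same values could later serve as per-term weights when ranking results.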

Description

Technical field

[0001] The invention relates to a full-text database retrieval method for perfecting subject terms, in particular to an accurate and efficient one, and belongs to the technical field of full-text database retrieval.

Background art

[0002] With the continuous development of digitization and informatization, search engines have become extremely important tools for acquiring information in all walks of life, and new information retrieval technologies keep emerging.

[0003] With the continuous development and popularization of the Internet, information from all walks of life accumulates on the network day by day, in many types and large quantities, covering life, society, education, technology, entertainment, and other aspects. Searching for valuable information in such a huge network information database is becoming increasingly d...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F16/31, G06F16/33, G06F16/338
CPC: G06F16/313, G06F16/3334, G06F16/3338, G06F16/3344, G06F16/338
Inventor: 刘秀萍, 高宏松
Owner: 刘秀萍