Index building method, searching method and searching result sorting method and corresponding device
A technology for index building and search result sorting, applied in the computer field, that addresses low search accuracy, an inability to identify and satisfy the user's search needs, and poor search results.
Examples
Embodiment 1
[0125] Figure 1 is a flow chart of the index building method provided by Embodiment 1 of the present invention. As shown in Figure 1, the following steps are performed on each captured page:
[0126] Step 101: Perform word segmentation and part-of-speech tagging on the page.
[0127] In addition, after word segmentation and part-of-speech tagging are performed on the page, the segmented text can be filtered against a stop word list, which can include adverbs, function words, particles, interrogative words, modal particles, etc. This filters out words that occur with high frequency in the page but carry little meaning.
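The filtering in paragraph [0127] can be sketched as follows. This is a minimal illustration, not the patented implementation: the part-of-speech label names, the `STOP_POS` set, and the function `filter_stop_words` are all hypothetical, and the input is assumed to be already segmented and tagged as `(word, pos)` pairs.

```python
# Hypothetical part-of-speech labels for the stop categories named in the text.
STOP_POS = frozenset({"adverb", "function", "particle", "interrogative", "modal"})


def filter_stop_words(tagged_tokens, stop_words=frozenset(), stop_pos=STOP_POS):
    """Drop tokens that are in the stop word list or tagged with a stop category."""
    return [
        (word, pos)
        for word, pos in tagged_tokens
        if word not in stop_words and pos not in stop_pos
    ]


# Assumed output of an upstream segmenter and part-of-speech tagger.
tagged = [
    ("Andy Lau", "noun"),
    ("really", "adverb"),
    ("of", "function"),
    ("date of birth", "noun"),
]
filtered = filter_stop_words(tagged)
```

Filtering by category as well as by an explicit word list is a design choice of this sketch; the text only says that the list can include words of these categories.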
[0128] Step 102: Based on semantic analysis, determine the entity words and the attribute words corresponding to each entity word from the words obtained after word segmentation, and mark them accordingly.
[0129] In the present invention, nouns that meet the preset entity word conditions can be determined as entity words, wherein the preset e...
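One way to picture the result of Steps 101 and 102 is an inverted index whose keys combine a word with its tag. The sketch below is illustrative only: the semantic analysis that decides which words are entity or attribute words is not reproduced here, so each page is assumed to arrive already tagged, and the tag names (`"entity"`, `"attribute"`, `"plain"`) and the function `build_index` are hypothetical.

```python
from collections import defaultdict


def build_index(tagged_pages):
    """Map each (word, tag) pair to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, tagged_words in tagged_pages.items():
        for word, tag in tagged_words:
            index[(word, tag)].add(page_id)
    return index


# Assumed output of Steps 101-102 for three captured pages.
tagged_pages = {
    1: [("Andy Lau", "entity"), ("date of birth", "attribute")],
    2: [("Andy Lau", "plain"), ("date of birth", "attribute")],
    3: [("Andy Lau", "entity"), ("film", "plain")],
}
index = build_index(tagged_pages)
```

Keeping the tag in the key means a page that merely mentions a word is stored separately from a page in which that word was marked as an entity word or attribute word.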
Embodiment 2
[0152] Figure 2 is a flow chart of the query analysis method provided by Embodiment 2 of the present invention. As shown in Figure 2, the method includes the following steps:
[0153] Step 201: Segment the received query.
[0154] Step 202: Perform part-of-speech tagging on each word obtained after word segmentation.
[0155] For example, after receiving the query of "Andy Lau's date of birth", the word segmentation process is performed on the query to obtain two words, "Andy Lau" and "date of birth", which are both marked as nouns. The above two steps are mature technologies in the prior art and will not be described in detail.
[0156] In addition, after word segmentation and part-of-speech tagging are performed on the query, the segmented query can be filtered against the preset stop word list, removing any words that the list contains. The stop word list can include: adverbs, function words, auxiliary words, interrogative words...
Embodiment 3
[0180] After the query has been analyzed as described in Embodiment 2, page search and recall return only those pages whose index entries match both the words in the query and the tags of those words (entity word or attribute word tags).
[0181] That is, when searching, the index is searched for each word obtained after word segmentation to find the pages whose index entries match both the word and its tag, and the intersection of the pages found for each word is then taken.
[0182] For example, for the query "Andy Lau's date of birth", word segmentation yields the words "Andy Lau" and "date of birth". Since "Andy Lau" has been analyzed as an entity word and "date of birth" as an attribute word, the search finds the pages corresponding to the index entries in which "Andy Lau" is marked as an entity word and the pages corresponding to the index entries in which "date of birth" is marked as an attribute word, and the intersection of the obtained page...
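The recall step in paragraphs [0181] and [0182] can be sketched as a lookup per tagged query word followed by a set intersection. This is a minimal sketch under stated assumptions: the index layout (a mapping from `(word, tag)` to page ids), the tag names, the page ids, and the function `search` are hypothetical and chosen only to mirror the "Andy Lau" example.

```python
def search(index, tagged_query):
    """Return the pages that match every (word, tag) pair in the query."""
    result = None
    for word, tag in tagged_query:
        pages = index.get((word, tag), set())
        # The first word seeds the result; later words narrow it by intersection.
        result = set(pages) if result is None else result & pages
    return result or set()


# Hypothetical index: page 2 mentions "Andy Lau" but not as an entity word.
index = {
    ("Andy Lau", "entity"): {1, 3},
    ("Andy Lau", "plain"): {2},
    ("date of birth", "attribute"): {1, 2},
}
query = [("Andy Lau", "entity"), ("date of birth", "attribute")]
hits = search(index, query)
```

In this toy data only page 1 is recalled: page 2 contains both words but does not carry the entity tag for "Andy Lau", and page 3 has no "date of birth" attribute entry.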