Method and device for expanding query, search engine system
A query expansion and indexing technique for search queries that addresses the problem of uncertainty and yields diverse expanded queries with good retrieval results.
Examples
Embodiment 1
[0048] Referring to Figure 1, a flow chart of the first embodiment of the method for expanding a query is shown.
[0049] S101, counting words that co-occur with the query word.
[0050] Counting the words that co-occur with the query word means determining which words appear in the same web page (or article) as the query word. In practical applications, a preferred statistical method is to build an index whose keywords are all the query words that have appeared, where the index content for each keyword is the set of words that appear together with that query word.
[0051] Referring to Figure 2, a diagram of the index is shown. The index is an inverted index structure: each keyword in the index is a query word, and the index content corresponding to each keyword is the set of words that co-occur with that query word. These co-occurring words may originate from multiple web pages. For example, for a certain query word, the co-occurring words are A, B, C, D, wherein words A and B appear simultaneously with the query word ...
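The inverted co-occurrence index described above can be sketched as follows. This is a minimal illustration only, not the patent's implementation: the function name `build_cooccurrence_index`, the whitespace tokenization, and the choice to record page identifiers alongside each co-occurring word are all assumptions made for the example.

```python
from collections import defaultdict

def build_cooccurrence_index(pages, query_words):
    """Map each query word to the words that co-occur with it.

    Assumption: `pages` maps a page id to its text, and words are
    separated by whitespace (real text may need word segmentation).
    """
    query_words = set(query_words)
    # index[query_word][co_word] -> set of page ids where both appear
    index = defaultdict(lambda: defaultdict(set))
    for page_id, text in pages.items():
        tokens = set(text.lower().split())
        # Only query words that actually appear on this page get entries.
        for q in tokens & query_words:
            for w in tokens - {q}:
                index[q][w].add(page_id)
    return index

# Hypothetical sample data: two pages that both contain "apple".
pages = {
    "page1": "apple fruit juice",
    "page2": "apple iphone mac",
}
index = build_cooccurrence_index(pages, ["apple"])
print(sorted(index["apple"]))
```

Keeping the page identifiers per co-occurring word is one way to reflect that the co-occurring words may come from different web pages; a simpler variant could store only a count.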
Embodiment 2
[0076] Referring to Figure 4, a flow chart of the second embodiment of the method for expanding a query is shown. Steps S401-S404 are the same as S101-S104 in Embodiment 1 and are not described in detail here.
[0077] S401, counting all words co-occurring with the query word;
[0078] In a search engine system, this step requires a very large database: in a web search database, the data set is the collection of all web pages that users can retrieve. Counting co-occurring words over such a collection demands substantial computing power. To address this, this embodiment adopts a distributed computing method that distributes the computing task across a computer cluster, thereby improving processing efficiency.
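One way to picture the distributed counting is as a map step over shards of pages followed by a reduce step that merges the partial counts. The sketch below is an assumption-laden illustration: the source does not name a framework, so a local thread pool stands in for the computer cluster, and `count_shard`, `merge_counts`, and `distributed_cooccurrence` are hypothetical names.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def count_shard(args):
    """Map step: count words co-occurring with query_word in one shard."""
    shard, query_word = args
    counts = Counter()
    for text in shard:
        tokens = set(text.lower().split())  # assumption: whitespace tokens
        if query_word in tokens:
            counts.update(tokens - {query_word})
    return counts

def merge_counts(a, b):
    """Reduce step: add two partial counts together."""
    return a + b

def distributed_cooccurrence(shards, query_word, workers=2):
    # A thread pool stands in for the cluster; each shard is one task.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_shard, [(s, query_word) for s in shards]))
    return reduce(merge_counts, partials, Counter())

# Hypothetical sample data: two shards of two pages each.
shards = [
    ["apple fruit juice", "banana fruit"],
    ["apple iphone", "apple fruit pie"],
]
totals = distributed_cooccurrence(shards, "apple")
print(totals["fruit"])  # number of pages where "fruit" co-occurs with "apple"
```

Because each shard is counted independently and the merge is a simple addition, the same structure would carry over to a real cluster scheduler without changing the per-shard logic.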
[0079] S402, classifying all co-occurring words;
[0080] S403, in each word class, selecting the most representative word and naming it;
[0081] S404, using the most representative words of each category as related query words of the query word;
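Steps S402 to S404 can be sketched as below. The classification method and the criterion for "most representative" are not given in this excerpt, so both are assumptions here: classes come from a hypothetical lookup table `word_to_class`, and the representative of each class is taken to be the word with the highest co-occurrence count (ties broken alphabetically).

```python
from collections import defaultdict

def related_query_words(cooccurrence_counts, word_to_class):
    """Group co-occurring words by class and pick one representative each.

    Assumptions: `word_to_class` is a hypothetical word -> class lookup,
    and "most representative" means highest co-occurrence count.
    """
    classes = defaultdict(list)
    for word, count in cooccurrence_counts.items():
        label = word_to_class.get(word)
        if label is not None:  # words without a known class are skipped
            classes[label].append((word, count))
    # Highest count wins; alphabetical order breaks ties deterministically.
    return {
        label: min(members, key=lambda m: (-m[1], m[0]))[0]
        for label, members in classes.items()
    }

# Hypothetical counts for words co-occurring with one query word.
counts = {"fruit": 5, "juice": 2, "iphone": 4, "mac": 3}
word_to_class = {
    "fruit": "food", "juice": "food",
    "iphone": "electronics", "mac": "electronics",
}
related = related_query_words(counts, word_to_class)
print(related)
```

Returning one word per class, rather than the overall top-counted words, is what keeps the related query words spread across different senses of the query word.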
[0082] ...