Search-engine-oriented error correction method and system for mixed Chinese-English queries
A search engine and query string technology, applied in the field of search-engine-oriented Chinese-English mixed query error correction, which addresses problems such as insufficient support and improves the accuracy of error correction.
Examples
Embodiment 1
[0053] As shown in Figure 1 and Figures 2 to 5, a search-engine-oriented Chinese-English mixed query error correction method includes the following steps:
[0054] S1. Crawl Internet webpage content using crawler technology;
[0055] S2. Use the webpage content crawled in step S1 and the search logs as a corpus to build a language model, and build a pinyin-based dictionary tree, an English index table and a word segmentation dictionary;
[0056] S3. For the query string input by the user, first evaluate it with the language model and calculate its rationality probability. If the rationality probability is lower than the set threshold A, or the number of search results obtained for the query string is less than threshold B, proceed to the error correction process of step S4;
[0057] S4. (1) If the query string contains only Chinese, the following error correction process is performed, as shown in Figure 2:
[0058] S101. If the input ...
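The gating check of step S3 can be illustrated with a minimal sketch. The text does not say which kind of language model is used or how the rationality probability is normalised, so the add-one-smoothed bigram model, the length-normalised log probability, and the names BigramModel and needs_correction below are all assumptions made for illustration only. With this scoring, threshold A would be expressed on a log scale.

```python
import math
from collections import Counter


class BigramModel:
    """Toy add-one-smoothed bigram model over pre-segmented tokens (an assumption)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for tokens in sentences:
            padded = ["<s>"] + list(tokens) + ["</s>"]
            self.unigrams.update(padded)
            self.bigrams.update(zip(padded, padded[1:]))
        self.vocab_size = len(self.unigrams)

    def avg_log_prob(self, tokens):
        """Return the log probability of the token sequence divided by its bigram count."""
        padded = ["<s>"] + list(tokens) + ["</s>"]
        total = 0.0
        for prev, cur in zip(padded, padded[1:]):
            numerator = self.bigrams[(prev, cur)] + 1
            denominator = self.unigrams[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total / (len(padded) - 1)


def needs_correction(tokens, model, result_count, threshold_a, threshold_b):
    """Step S3 gate: correct when the score is below A or the result count is below B."""
    return model.avg_log_prob(tokens) < threshold_a or result_count < threshold_b


# Hypothetical corpus and thresholds, for illustration only.
model = BigramModel([["搜索", "引擎"], ["搜索", "引擎", "优化"], ["搜索", "结果"]])
print(needs_correction(["搜索", "引擎"], model, result_count=100,
                       threshold_a=-1.5, threshold_b=10))
```

In a real system the query would first be segmented with the word segmentation dictionary, and both thresholds would be tuned on search logs rather than fixed by hand.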
Embodiment 2
[0073] This embodiment provides a system applying the method of Embodiment 1. As shown in Figure 1, the specific scheme is as follows:
[0074] The system includes a learning module, an error correction module and a training module;
[0075] The learning module is used to mine new words from the corpus and add the mined words to the word segmentation dictionary, which is used to segment the query string in step S3;
[0076] The training module is used to build a language model based on the corpus, and to build the pinyin-based dictionary tree, the English index table and the word segmentation dictionary;
[0077] The error correction module is used for error correction processing.
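The pinyin-based dictionary tree built by the training module can be pictured as a trie keyed by pinyin syllables, where each node stores the words pronounced along that path. The layout below is a sketch under that assumption; this excerpt does not describe the actual node structure, and the class name PinyinTrie and its methods are hypothetical. The pinyin of each word is supplied by the caller rather than derived by the code.

```python
class PinyinTrie:
    """Trie keyed by pinyin syllables; each node keeps the words spelled by its path."""

    def __init__(self):
        self.children = {}
        self.words = []

    def insert(self, syllables, word):
        """Store a word under its pinyin syllable sequence."""
        node = self
        for syllable in syllables:
            node = node.children.setdefault(syllable, PinyinTrie())
        if word not in node.words:
            node.words.append(word)

    def lookup(self, syllables):
        """Return all words whose pinyin matches the syllable sequence exactly."""
        node = self
        for syllable in syllables:
            node = node.children.get(syllable)
            if node is None:
                return []
        return list(node.words)


trie = PinyinTrie()
trie.insert(["yu", "yan"], "语言")
trie.insert(["yu", "yan"], "预言")
trie.insert(["sou", "suo"], "搜索")
print(trie.lookup(["yu", "yan"]))  # homophones sharing the same pinyin path
```

Keying by pinyin means homophones land on the same node, which gives the error correction module a candidate set for a mistyped Chinese word or a query typed directly in pinyin.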
[0078] In a specific implementation process, the error correction module includes a Chinese error correction submodule, a Chinese and letter error correction submodule, an English and Pinyin error correction submodule, wherein the Chinese error co...