Method for obtaining restricted-word information, method for optimizing output, and input method system
A technology involving restricted-word information and feature information, applied in the field of computer character input data processing. It addresses the problems of reduced input efficiency, cumbersome user input, and a growing number of candidates presented to the user, and it optimizes the character output process and makes the input method more intelligent.
Embodiment 1
[0054] Referring to FIG. 1, Embodiment 1 of a method for obtaining restricted-word information is shown, which may specifically include:
[0055] Step 101, obtaining a target word;
[0056] The target word can be obtained from the Internet, that is, extracted directly from an Internet corpus (for example, a collection of web pages or a collection of search keywords) through statistics and screening, or it can be obtained from an existing thesaurus. The source need not be limited, as long as a set of target words can be obtained; those skilled in the art can set the scope of the set according to actual needs.
[0057] Preferably, the obtained set of target words may also undergo an optimization step in which some attributes of the target words are used to remove part of the vocabulary, further narrowing the scope. For example, words whose Internet word frequency or word frequency in the thesaurus is less than or equal to a preset threshold...
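The optimization step can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: it assumes that the truncated example above means low-frequency words are the ones removed, and the function name `prune_target_words`, the frequency table, and the threshold value are all hypothetical.

```python
def prune_target_words(word_freq, threshold):
    """Return the target words whose frequency is strictly above `threshold`.

    `word_freq` maps each candidate target word to its frequency
    (Internet frequency or thesaurus frequency; which one is an assumption).
    """
    return {word for word, freq in word_freq.items() if freq > threshold}


# Hypothetical frequencies, for illustration only.
candidates = {"input method": 120, "quantity will": 35, "zzyx": 1}
kept = prune_target_words(candidates, threshold=1)
```

With these made-up numbers, only the entry whose frequency is at or below the threshold is dropped; the remaining words proceed to the feature checks.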
Example 1
[0064] The feature information is: the feature value of the target word's first character appearing as a word-initial character in the preset corpus, and the feature value of the target word's last character appearing as a word-final character in the preset corpus;
[0065] The preset condition for judging is: whether at least one of the above feature values falls within a preset range.
[0066] For example, the character "quantity" in "quantity will" seldom appears at the beginning of a word; if its word-initial frequency is less than or equal to the preset threshold, then "quantity will" can be determined to be a restricted word.
[0067] Of course, if the target word is composed of three or more characters, it is also possible to determine the feature value of the character at a given position in the target word appearing at the same position in words of the preset corpus.
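The word-initial and word-final check of Example 1 might look like the sketch below. It assumes the feature value is a relative frequency (the share of a character's occurrences that fall at the start or end of a word) and that the corpus is already segmented into words; the text does not fix either choice. The helper names, the toy corpus (each Latin letter stands in for one character), and the threshold are hypothetical.

```python
from collections import Counter


def position_features(corpus_words):
    """Compute, per character, how often it is word-initial and word-final.

    Returns two dicts mapping character -> share of its occurrences
    that are at the start (resp. end) of a word in the corpus.
    """
    total, initial, final = Counter(), Counter(), Counter()
    for word in corpus_words:
        if not word:
            continue
        total.update(word)          # count every character occurrence
        initial[word[0]] += 1       # first character of the word
        final[word[-1]] += 1        # last character of the word
    initial_rate = {ch: initial[ch] / n for ch, n in total.items()}
    final_rate = {ch: final[ch] / n for ch, n in total.items()}
    return initial_rate, final_rate


def is_restricted(word, initial_rate, final_rate, threshold):
    """A word is restricted if at least one feature value is <= threshold."""
    first = initial_rate.get(word[0], 0.0)
    last = final_rate.get(word[-1], 0.0)
    return first <= threshold or last <= threshold


# Toy segmented corpus; each letter is a stand-in for one character.
corpus = ["ab", "ac", "cb", "dcb", "ad"]
initial_rate, final_rate = position_features(corpus)
restricted = is_restricted("ba", initial_rate, final_rate, threshold=0.1)
allowed = is_restricted("ab", initial_rate, final_rate, threshold=0.1)
```

In this toy corpus "b" never begins a word, so "ba" is flagged, while "ab" is not. The same counting extends to other positions for words of three or more characters by keeping one counter per position.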
Example 2
[0069] The feature information is: the feature value, in the preset corpus, of the linguistic collocation relationship of each single-character word and/or multi-character word contained in the target word;
[0070] The preset condition for judging is: whether at least one of the above feature values falls within a preset range.
[0071] The linguistic collocation relationship may include several kinds of matching relationship, such as collocation parameters between words, between words and parts of speech, and between parts of speech. Those skilled in the art may select or combine these matching relationships according to actual needs.
[0072] For example, in the word "is to play", "yes" is followed by a verb. Such a collocation is linguistically rare, so its collocation feature value is less than or equal to the preset threshold, and it can then be determined that "yes" Pl...
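One way to picture the word-to-part-of-speech collocation check is the sketch below. It is an assumption-laden illustration: the feature value is taken to be the conditional relative frequency of a part of speech following a given word in a tagged corpus, which the text does not specify. The function names, tag labels, placeholder tokens, and threshold are hypothetical.

```python
from collections import Counter


def collocation_rates(tagged_sentences):
    """Estimate how often each word is followed by each part of speech.

    `tagged_sentences` is a list of sentences, each a list of
    (token, pos) pairs. Returns {(token, next_pos): relative frequency}.
    """
    pair_counts, left_counts = Counter(), Counter()
    for sent in tagged_sentences:
        for (tok, _), (_, next_pos) in zip(sent, sent[1:]):
            pair_counts[(tok, next_pos)] += 1
            left_counts[tok] += 1
    return {pair: c / left_counts[pair[0]] for pair, c in pair_counts.items()}


def has_rare_collocation(tagged_word, rates, threshold):
    """True if any adjacent (word, next POS) pair has a value <= threshold."""
    return any(
        rates.get((tok, next_pos), 0.0) <= threshold
        for (tok, _), (_, next_pos) in zip(tagged_word, tagged_word[1:])
    )


# Toy tagged corpus; "x" is a placeholder token, tags are illustrative.
corpus = [
    [("x", "PART"), ("good", "ADJ")],
    [("x", "PART"), ("good", "ADJ")],
    [("x", "PART"), ("big", "ADJ")],
    [("x", "PART"), ("play", "VERB")],
]
rates = collocation_rates(corpus)
rare = has_rare_collocation([("x", "PART"), ("run", "VERB")], rates, 0.3)
common = has_rare_collocation([("x", "PART"), ("big", "ADJ")], rates, 0.3)
```

Here "x" is followed by a verb in only one of four observations, so a candidate in which "x" precedes a verb is flagged. Word-to-word or part-of-speech-to-part-of-speech collocations could be counted the same way by changing the key of the pair.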