Method for obtaining restricted-word information, method for optimizing output, and input method system
A technique for restricted-word information and input methods, applied in the input/output processes of data processing, special-purpose data processing applications, instruments, and similar fields. It addresses problems such as reduced input efficiency, inconvenient user input, and an increased number of candidates presented to the user, with the effect of optimizing the character output process and improving intelligence.
Examples
Embodiment 1
[0052] Referring to Figure 1, Embodiment 1 of a method for obtaining restricted-word information is shown, which may specifically include:
[0053] Step 101, obtain a target word;
[0054] The target words can be obtained from the Internet, that is, extracted directly from an Internet corpus (for example, a collection of web pages or a collection of search keywords) through statistics and screening, or they can be taken from an existing lexicon. The source need not be limited, as long as a set of target words can be obtained; those skilled in the art can set the size of the set according to actual needs.
[0055] Preferably, the obtained set of target words may also undergo an optimization step that uses attributes of the target words to remove some of them and further narrow the scope. For example, words whose Internet word frequency or lexicon word frequency is less than or equal to a preset threshold are removed from th...
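The frequency-based pruning described in [0055] can be illustrated with a minimal sketch. The function name `prune_target_words`, the toy token list, and the threshold value are assumptions made for illustration only; the source does not specify how frequencies are counted or what threshold is used.

```python
from collections import Counter

def prune_target_words(word_freq, min_freq):
    """Drop words whose frequency is less than or equal to min_freq.

    word_freq: mapping from word to its corpus (or lexicon) frequency.
    min_freq:  the preset threshold (hypothetical value chosen by the caller).
    """
    return {word for word, freq in word_freq.items() if freq > min_freq}

# Toy stand-in for an Internet corpus; real data would come from web pages
# or search keywords, as described in [0054].
corpus_tokens = ["input", "method", "input", "candidate",
                 "input", "method", "rareword"]
word_freq = Counter(corpus_tokens)

kept = prune_target_words(word_freq, min_freq=1)
```

Here the words that occur only once fall at or below the threshold and are removed, so only the more frequent words remain in the target set.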
Example 1
[0062] The feature information is: the feature value, in a preset corpus, of the single character at the beginning of the target word occurring as a word-initial character, and the feature value, in the preset corpus, of the single character at the end of the target word occurring as a word-final character;
[0063] The preset condition for judging is: whether at least one of the above feature values falls within the preset range.
[0064] For example, the character "quantity" in "quantity will" rarely appears at the beginning of a word; if its word-initial frequency is less than or equal to a preset threshold, "quantity will" can be determined to be a restricted word.
[0065] Of course, for a target word consisting of three or more characters, it is also possible to determine the feature value, in the preset corpus, of a character at a given position in the target word occurring at that same position in a word.
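The positional test of Example 1 might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: raw occurrence counts stand in for the "feature value", the names `position_counts` and `is_restricted_by_position` are hypothetical, and the toy corpus uses Latin letters as stand-ins for single characters.

```python
from collections import Counter

def position_counts(corpus_words):
    """Count how often each character occurs word-initially and word-finally."""
    initial = Counter(w[0] for w in corpus_words if w)
    final = Counter(w[-1] for w in corpus_words if w)
    return initial, final

def is_restricted_by_position(word, initial, final, threshold):
    """Flag the word if at least one positional feature value is in the
    preset range, taken here (as an assumption) to mean count <= threshold."""
    return initial[word[0]] <= threshold or final[word[-1]] <= threshold

# Toy preset corpus: "a" is a common word-initial character,
# "b" is a common word-final character.
corpus_words = ["ab", "ac", "ad", "cb", "db"]
initial, final = position_counts(corpus_words)

common = is_restricted_by_position("ab", initial, final, threshold=1)
rare = is_restricted_by_position("ba", initial, final, threshold=1)
```

In this toy data "b" never begins a word, so "ba" is flagged, while "ab" is not. Extending the check to a character at an interior position, as [0065] suggests, would only require one more counter keyed by position.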
Example 2
[0067] The feature information is: the feature value, in the preset corpus, of the linguistic collocation relationship of each single-character word and/or multi-character word contained in the target word;
[0068] The preset condition for judging is: whether at least one of the above feature values falls within the preset range.
[0069] The linguistic collocation relationship may include collocation parameters between words, between words and parts of speech, and between parts of speech. Those skilled in the art can select or combine these collocation relationships according to actual needs.
[0070] For example, in "is to play", "is" is followed by a verb; such a collocation is rare in linguistics, so its collocation feature value is found to be less than or equal to the preset threshold, and it can then be determined that "is to play"...
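The collocation test of Example 2 can be sketched in a similar way, using the word-to-part-of-speech variant from [0069]. Everything here is an assumption made for illustration: the tag set, the toy tagged corpus, the use of raw pair counts as the collocation feature value, and the function names `word_pos_collocations` and `is_restricted_by_collocation`.

```python
from collections import Counter

def word_pos_collocations(tagged_sentences):
    """Count (word, part-of-speech of the following word) pairs."""
    counts = Counter()
    for sentence in tagged_sentences:
        for (word, _), (_, next_pos) in zip(sentence, sentence[1:]):
            counts[(word, next_pos)] += 1
    return counts

def is_restricted_by_collocation(tagged_word, counts, threshold):
    """Flag the target word if any adjacent unit pair has a collocation
    count less than or equal to the threshold (assumed preset range)."""
    return any(
        counts[(word, next_pos)] <= threshold
        for (word, _), (_, next_pos) in zip(tagged_word, tagged_word[1:])
    )

# Toy tagged corpus with hypothetical tags: V, ADJ, DET, N, PART.
tagged_sentences = [
    [("is", "V"), ("good", "ADJ")],
    [("is", "V"), ("big", "ADJ")],
    [("is", "V"), ("a", "DET"), ("game", "N")],
    [("to", "PART"), ("play", "V")],
    [("to", "PART"), ("run", "V")],
]
counts = word_pos_collocations(tagged_sentences)

rare = is_restricted_by_collocation([("is", "V"), ("play", "V")], counts, threshold=1)
common = is_restricted_by_collocation([("to", "PART"), ("play", "V")], counts, threshold=1)
```

In the toy corpus "is" is never followed by a verb, so the first candidate is flagged, whereas "to" followed by a verb is frequent enough to pass. Word-to-word or tag-to-tag collocations could be counted the same way by changing the key of the counter.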