Method for creating error-correcting database, automatic error correcting method and system
A database and error-correction technology, applied to generating error-correction databases for character data. It addresses problems such as poor applicability and unguaranteed accuracy, and achieves wide applicability, correction of input errors, and wide coverage.
Examples
Embodiment 1
[0069] This embodiment uses a query log as the data source. A query log is generally recorded by a search engine, and each user's query records can be separated by IP address or user login name; the query records can also be recorded by local clients and then aggregated.
[0070] The query log generally includes the history of the query keywords that users entered, for example:
[0071] 10.10.1.1 Shanghai 2008-02-25.09:00:00
[0072] 10.10.1.1 Wrestle 2008-02-25.11:00:00
[0073] 10.10.1.1 Bodou 2008-02-25.12:00:09
[0074] 192.10.1.1 Wrestle 2008-02-23.13:00:00
[0075] 192.10.1.1 bodou 2008-02-23.13:00:05
[0076] 192.10.1.1 Nanjing 2008-02-23.15:00:05
[0077] Each line in the log above represents one user query, and each record includes the following information: user identification (for example, an account number, nickname, or IP address, which can generally be used to uniquely identify a user), query ...
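The record layout described above can be illustrated with a minimal Python sketch that parses the sample lines and groups them by user. The field order (user ID, query, timestamp) and the timestamp format are assumptions read off the sample log, and the function names `parse_line` and `group_by_user` are hypothetical. The sketch covers only the separation of records by user, not the later mining steps.

```python
from collections import defaultdict
from datetime import datetime

# Sample lines copied from the log excerpt above (paragraph numbers removed).
LOG_LINES = [
    "10.10.1.1 Shanghai 2008-02-25.09:00:00",
    "10.10.1.1 Wrestle 2008-02-25.11:00:00",
    "10.10.1.1 Bodou 2008-02-25.12:00:09",
    "192.10.1.1 Wrestle 2008-02-23.13:00:00",
    "192.10.1.1 bodou 2008-02-23.13:00:05",
    "192.10.1.1 Nanjing 2008-02-23.15:00:05",
]

# Assumed from the sample timestamps; a real log may use another format.
TIME_FORMAT = "%Y-%m-%d.%H:%M:%S"


def parse_line(line):
    """Split one record into (user_id, query, timestamp).

    Assumes the first field is the user ID, the last field is the
    timestamp, and everything in between is the query string.
    """
    user_id, rest = line.split(maxsplit=1)
    query, stamp = rest.rsplit(maxsplit=1)
    return user_id, query, datetime.strptime(stamp, TIME_FORMAT)


def group_by_user(lines):
    """Return {user_id: [(timestamp, query), ...]}, each list sorted by time."""
    sessions = defaultdict(list)
    for line in lines:
        user_id, query, when = parse_line(line)
        sessions[user_id].append((when, query))
    for records in sessions.values():
        records.sort()
    return dict(sessions)


if __name__ == "__main__":
    for user, records in group_by_user(LOG_LINES).items():
        print(user, [query for _, query in records])
```

Keeping each user's queries in time order makes it straightforward to compare consecutive queries from the same user in a later step.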
Embodiment 2
[0087] In this embodiment, the user's input method log information is taken as an example for illustration. The input method log information may include the coded character string input by the user and the corresponding input candidate items. In this embodiment, the user input sequence information may be used to mine and obtain the required character error correction relationship, as follows:
[0088] Check whether any coded strings appear directly adjacent to each other in the log. If so, determine that the adjacent coded strings are in a character error-correction relationship, and that the last coded string, the one actually used to input the candidate, is the correct one.
[0089] For the user's input history, the input method log can record the information "user ID-coded string-input candidate"; the "user ID" is an optional field. In the case of manual error correction by the user, the input method log may record information "user ID-encoded string-encoded string-inp...
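The adjacency rule in [0088] can be sketched in Python as follows. The event representation (a list of `(kind, value)` tuples), the function name `mine_adjacent_codes`, and the sample strings are all hypothetical and chosen only for illustration; the source does not specify a concrete log encoding. Mapping every earlier coded string in a run to the last one is also an assumption for runs longer than two.

```python
def mine_adjacent_codes(events):
    """Return (wrong_code, right_code) pairs from one user's event list.

    `events` is a hypothetical representation: a list of (kind, value)
    tuples, where kind is "code" (a coded string the user typed) or
    "candidate" (the input candidate that was finally committed).
    """
    pairs = []
    run = []  # coded strings seen since the last committed candidate
    for kind, value in events:
        if kind == "code":
            run.append(value)
        elif kind == "candidate":
            # Two or more adjacent coded strings: the last one produced
            # the candidate, so it is taken as the correct form.
            if len(run) >= 2:
                right = run[-1]
                pairs.extend(
                    (wrong, right) for wrong in run[:-1] if wrong != right
                )
            run = []
    return pairs


if __name__ == "__main__":
    # Illustrative data only, not taken from the source.
    sample = [
        ("code", "shanghia"),
        ("code", "shanghai"),
        ("candidate", "上海"),
        ("code", "nanjing"),
        ("candidate", "南京"),
    ]
    print(mine_adjacent_codes(sample))
```

A record with a single coded string followed by a candidate yields no pair, which matches the normal "coded string-input candidate" case.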
Embodiment 3
[0093] This embodiment also uses the input method log as an example. The difference from Embodiment 2 is that the input method log of this embodiment also records the user's deletion-related operations, such as the backspace key, the delete key, the Esc key, and replacement operations. A replacement operation can be seen as a combination of a deletion operation and a re-input operation.
[0094] Under normal circumstances, the user does not use a delete operation during normal input; a delete operation typically results from the user's manual error correction. Therefore, when a delete operation appears in the user's input record, it can be determined that the record contains manual error-correction information. In this embodiment, the following analysis and mining steps can be used to obtain the character error correction relationship:
[0095] Find whether the user has applied a delete operation during the input process, and if so, determine that the en...
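The detection step in [0094] and [0095], finding input records in which a delete operation occurs, can be sketched in Python as below. The key-event names in `DELETE_OPS`, the representation of a record as a list of key tokens, and the function names are hypothetical. The sketch only flags candidate records; it does not attempt the subsequent determination, which the text above does not fully show.

```python
# Hypothetical names for the deletion-related operations listed in [0093].
DELETE_OPS = {"BACKSPACE", "DELETE", "ESC", "REPLACE"}


def has_delete_operation(keys):
    """Return True if the key sequence contains any deletion-related operation."""
    return any(key in DELETE_OPS for key in keys)


def records_with_deletes(records):
    """Keep only the input records that show signs of manual error correction."""
    return [keys for keys in records if has_delete_operation(keys)]


if __name__ == "__main__":
    # Illustrative data only: each record is one input sequence of key tokens.
    sample = [
        list("nanjing"),
        list("shanghia") + ["BACKSPACE", "BACKSPACE"] + list("ai"),
    ]
    for record in records_with_deletes(sample):
        print(record)
```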