Method and system for automatically correcting character strings

The character string automatic correction technology, applied in the fields of electronic digital data processing, natural language data processing, and instruments, addresses the high cost of anti-fraud risk control, reduced e-commerce efficiency, and the difficulty of automatic correction.

Active Publication Date: 2014-09-10
SHANGHAI CTRIP COMMERCE CO LTD

AI Technical Summary

Problems solved by technology

[0004] The technical problem to be solved by the present invention is to overcome the defect in the prior art that it is impossible to automatically and efficiently make a relatively accurate judgment on the authenticity or accuracy of character string information input by a user, and



Examples


Embodiment 1

[0068] In the character string automatic correction method of this embodiment, a character string database stores a plurality of verified character strings and a plurality of preset first-type words, and each verified character string includes several first-type words. Referring to Figure 1, the method includes the following steps:

[0069] S1: from the plurality of verified character strings, extract the other words separated by the first-type words as second-type words, and take the phrase formed by each second-type word together with the immediately following first-type word as a preset phrase; then generate a keyword database that records the first-type words, the second-type words, the preset phrases, and a word order, the word order being the preset arrangement order of the first-type words;
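Step S1 can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes words are separated by whitespace, and the function name `build_keyword_database`, the dictionary keys, and the address-style sample data are hypothetical choices made for illustration only.

```python
def build_keyword_database(verified_strings, first_type_words):
    """Sketch of S1: derive second-type words and preset phrases.

    Assumption: tokens are whitespace-separated. A second-type word is the
    run of tokens between two first-type words, and a preset phrase is that
    run followed by the first-type word that closes it.
    """
    markers = set(first_type_words)
    second_type_words = set()
    preset_phrases = set()
    for string in verified_strings:
        pending = []  # tokens seen since the last first-type word
        for token in string.split():
            if token in markers:
                if pending:
                    word = " ".join(pending)
                    second_type_words.add(word)
                    preset_phrases.add(f"{word} {token}")
                pending = []
            else:
                pending.append(token)
    return {
        "first_type_words": markers,
        "second_type_words": second_type_words,
        "preset_phrases": preset_phrases,
        # The caller's ordering of first-type words is kept as the word order.
        "word_order": list(first_type_words),
    }
```

Called with a verified string such as "Shanghai City Changning District Fuquan Road" and the first-type words City, District, and Road, the sketch records "Shanghai", "Changning", and "Fuquan" as second-type words and the three two-word phrases as preset phrases.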

[0070] S2: generate a phrase permutation statistical table, the permutation probability that each preset phrase appears a...

Embodiment 2

[0091] Compared with Embodiment 1, the character string automatic correction method of this embodiment differs only in the following respects:

[0092] The character string database also stores a weight value for each first-type word, and S9 is replaced by S9a, where S9a is:

[0093] Query the phrase permutation statistical table to obtain the permutation probability of the valid phrase at the beginning of the output character string and the permutation probabilities of adjacent valid phrases, and calculate the weighted average of the obtained permutation probabilities as the accuracy. The weight of each permutation probability equals the weight value of the first-type word in the first valid phrase of the output character string or, for a pair of adjacent valid phrases, the weight value of the first-type word in the latter phrase of the pair.
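The weighted average in S9a can be sketched as follows. This is an illustration under assumptions, not the patent's own code: the layout of the statistical table (a `start_prob` mapping for the leading phrase and a `pair_prob` mapping keyed by adjacent phrase pairs), the function name, and the fallback of 0.0 for missing entries are all hypothetical.

```python
def weighted_accuracy(phrases, start_prob, pair_prob, weights):
    """Sketch of S9a: weighted average of permutation probabilities.

    phrases    -- valid phrases of the output string, in order, each given
                  as (phrase_text, first_type_word)
    start_prob -- assumed table: probability that a phrase starts a string
    pair_prob  -- assumed table: probability keyed by (previous, current)
    weights    -- weight value of each first-type word
    """
    if not phrases:
        return 0.0
    probabilities = []
    applied_weights = []
    # Leading phrase: weighted by its own first-type word.
    first_text, first_marker = phrases[0]
    probabilities.append(start_prob.get(first_text, 0.0))
    applied_weights.append(weights[first_marker])
    # Each adjacent pair: weighted by the first-type word of the latter phrase.
    for (prev_text, _), (cur_text, cur_marker) in zip(phrases, phrases[1:]):
        probabilities.append(pair_prob.get((prev_text, cur_text), 0.0))
        applied_weights.append(weights[cur_marker])
    total = sum(applied_weights)
    if total == 0:
        return 0.0
    return sum(p * w for p, w in zip(probabilities, applied_weights)) / total
```

With a leading probability of 0.8 weighted 3.0 and one adjacent-pair probability of 0.5 weighted 1.0, the accuracy is (0.8 × 3.0 + 0.5 × 1.0) / 4.0, that is, about 0.725.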

[0094] In addition, S61 is executed after S6, where S61 is: select from the invalid word part the phrase that includes a first-type word as the unknown p...

Embodiment 3

[0103] Referring to Figure 2, the character string automatic correction system of this embodiment comprises:

[0104] A character string database module 1, used for storing a plurality of verified character strings and a plurality of preset first-type words, where each verified character string includes several first-type words;

[0105] A keyword database module 2, used for extracting, from the plurality of character strings, the other words separated by the first-type words as second-type words, taking the phrase formed by each second-type word together with the next first-type word as a preset phrase, and then generating a keyword database that records the first-type words, the second-type words, the preset phrases, and a word order, the word order being the preset arrangement order of the first-type words;

[0106] A phrase permutation statistical module 3, used for calculating and recording the permu...



Abstract

The invention discloses a method and system for automatically correcting character strings. The method includes the following steps: generating a keyword database that records first-type words, second-type words, preset phrases, and a word order; generating a phrase permutation statistical table from the keyword database; reading the input character string; selecting the first-type words from the input character string and dividing the character string into keyword groups; selecting valid phrases, words to be combined, and invalid words from the keyword groups; forming valid phrases from the words to be combined according to the phrase permutation statistical table; generating the output character string; and calculating the accuracy according to the phrase permutation statistical table and outputting it. The method and system combine word-bank matching with statistical probability, so that the accuracy of input character string information can be judged and clerical errors made during user input can be recognized and automatically corrected, which improves the operating efficiency of e-commerce.
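The splitting and selection steps named in the abstract can be sketched roughly as below. This is an assumption-laden illustration: tokens are taken to be whitespace-separated, both function names are hypothetical, and because this text does not give the criteria for "words to be combined", the sketch only separates groups that match a preset phrase from those that do not.

```python
def split_into_keyword_groups(input_string, first_type_words):
    """Cut the input after every first-type word (assumed whitespace tokens).

    Returns the keyword groups plus any trailing text that is not closed
    by a first-type word.
    """
    markers = set(first_type_words)
    groups, pending = [], []
    for token in input_string.split():
        pending.append(token)
        if token in markers:
            groups.append(" ".join(pending))
            pending = []
    return groups, " ".join(pending)


def classify_groups(groups, preset_phrases):
    """Separate groups found in the keyword database from the rest."""
    valid = [g for g in groups if g in preset_phrases]
    unmatched = [g for g in groups if g not in preset_phrases]
    return valid, unmatched
```

For an input such as "Shanghai City Changnin District Fuquan Road 99", the misspelled group "Changnin District" ends up in the unmatched list, which is where the later correction steps would operate.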

Description

Technical field

[0001] The invention relates to a method and system for automatically correcting character strings.

Background

[0002] As e-commerce plays an increasingly important role in people's daily life, the authenticity and accuracy of user input in e-commerce have become a focus for many e-commerce companies. E-commerce often involves filling in information in a conventional format, such as a delivery address, and such information usually plays an important role in the interaction and communication between merchants and users. However, among the massive amount of information input by users, some harassing information, namely false information, inevitably appears. As a result, the authenticity and accuracy of some input information are questionable, which hinders further communication between merchants and users or the conduct of transactions.

[0003] In fact, for small errors caused b...


Application Information

IPC(8): G06F17/30, G06Q30/00
CPC: G06F16/90344, G06F40/232
Inventor: 刘利, 黄晓君
Owner: SHANGHAI CTRIP COMMERCE CO LTD