A Chinese error correction method, device, equipment and storage medium based on mutual information

An error correction method and device, applied in the field of text error correction, that address the low accuracy of Chinese error correction and improve that accuracy.

Active Publication Date: 2021-04-16
MASHANG CONSUMER FINANCE CO LTD

AI Technical Summary

Problems solved by technology

[0004] Existing pinyin error correction may ignore grammatical errors when dealing with word collocation errors, which lowers the accuracy of Chinese error correction.

Embodiment Construction

[0051] The core of the present invention is a Chinese error correction method based on mutual information. Mutual information is a useful measure in information theory that characterizes the correlation between two sets of events. Specifically, it can be viewed as the amount of information that one random variable contains about another random variable, or as the reduction in uncertainty of one random variable once another random variable is known.
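The definition above corresponds to the standard information-theoretic formula, in which the mutual information of X and Y sums p(x, y) * log(p(x, y) / (p(x) * p(y))) over all value pairs. The following is a minimal sketch of that textbook formula, not code from the patent; the function name `mutual_information` and the two toy distributions are assumptions chosen for illustration.

```python
import math


def mutual_information(joint):
    """Return I(X;Y) in bits for a joint distribution {(x, y): probability}."""
    px, py = {}, {}
    # Marginalize the joint distribution to get p(x) and p(y).
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over pairs with p > 0.
    return sum(
        p * math.log2(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0
    )


# Two independent fair bits: knowing one says nothing about the other.
independent = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}
# Two perfectly correlated bits: knowing one fully determines the other.
dependent = {(0, 0): 0.5, (1, 1): 0.5}

mi_independent = mutual_information(independent)  # 0.0 bits
mi_dependent = mutual_information(dependent)      # 1.0 bit
```

The independent case yields zero and the fully dependent case yields one bit, which matches the reading of mutual information as reduced uncertainty.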

[0052] The starting point of the present invention is to use pinyin to correct Chinese errors. In Chinese, there is often a collocation relationship between one word and the next. Take "my loan is overdue" as an example: for the word "loan" followed by the pinyin "yuqi", the correct collocation is "overdue" rather than the homophone "remaining period". To correct an input in which "remaining period" was written in place of "overdue", the "remaining period" must be corrected to "overdue". In order to solve this technical req...

Abstract

The invention discloses a Chinese error correction method based on mutual information. The method comprises the following steps: obtaining a target short sentence to be corrected; performing word segmentation on the target short sentence to obtain a word segmentation sequence; determining the pinyin combination sequence corresponding to the word segmentation sequence, where each pinyin combination corresponds one-to-one to a segmented word in the word segmentation sequence; obtaining a set of error correction word sequences from the homophones mapped by each pinyin combination; calculating the mutual information of each error correction word sequence in the set; and determining the error correction result of the target short sentence according to the magnitude of the mutual information. With the technical solution provided by the embodiment of the present invention, the target short sentence can be corrected according to both word frequency and word collocation frequency, which improves the accuracy of Chinese error correction. The invention also discloses a mutual information-based Chinese error correction device, mutual information-based Chinese error correction equipment, and a computer-readable storage medium with corresponding technical effects.
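The steps in the abstract can be sketched roughly as follows. This is an illustrative sketch under several assumptions, not the patented implementation: the input is taken as already segmented, English glosses stand in for the Chinese words, the pinyin strings other than "yuqi" are hypothetical, the toy corpus is invented, and the sequence score is assumed to be the sum of pointwise mutual information over adjacent word pairs, since the excerpt does not give the exact formula.

```python
import itertools
import math
from collections import Counter

# Hypothetical pinyin lexicon; English glosses stand in for Chinese words.
PINYIN_OF = {
    "my": "wode",
    "loan": "daikuan",
    "overdue": "yuqi",
    "remaining_period": "yuqi",  # homophone of "overdue"
}

# Invented toy corpus of segmented sentences, used only for counting.
CORPUS = [
    ["my", "loan", "overdue"],
    ["my", "loan", "overdue"],
    ["loan", "overdue"],
    ["remaining_period", "loan"],
]


def build_counts(corpus):
    """Count single words and adjacent word pairs in the corpus."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in corpus:
        unigrams.update(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    return unigrams, bigrams


def pmi(a, b, unigrams, bigrams, floor=-10.0):
    """Pointwise mutual information of the adjacent pair (a, b) in bits.

    Unseen pairs get a fixed penalty (an assumption made for simplicity).
    """
    pair = bigrams[(a, b)]
    if pair == 0:
        return floor
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    p_ab = pair / n_bi
    return math.log2(p_ab / ((unigrams[a] / n_uni) * (unigrams[b] / n_uni)))


def correct(words, pinyin_of, unigrams, bigrams):
    """Return the homophone sequence with the highest summed PMI."""
    # Invert the lexicon: pinyin -> all words pronounced that way.
    homophones = {}
    for word, py in pinyin_of.items():
        homophones.setdefault(py, []).append(word)
    # One pinyin per segmented word, then every combination of homophones.
    pinyin_seq = [pinyin_of[w] for w in words]
    candidates = itertools.product(*(homophones[py] for py in pinyin_seq))

    def score(seq):
        return sum(pmi(a, b, unigrams, bigrams) for a, b in zip(seq, seq[1:]))

    return list(max(candidates, key=score))


unigrams, bigrams = build_counts(CORPUS)
result = correct(["my", "loan", "remaining_period"], PINYIN_OF, unigrams, bigrams)
print(result)  # ['my', 'loan', 'overdue']
```

Because "loan" is followed by "overdue" in the toy corpus but never by "remaining_period", the candidate containing "overdue" scores higher and is selected. A real system would replace the toy lexicon and counts with a word segmenter, a pinyin converter, and statistics from a large corpus.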

Description

Technical field

[0001] The invention relates to the technical field of text error correction, and in particular to a mutual information-based Chinese error correction method, device, and storage medium.

Background technique

[0002] With the rapid development of electronic publishing, automatic Chinese proofreading technology has also advanced considerably. Pinyin error correction technology in particular has played an important role in that development.

[0003] In recent years, pinyin error correction technology has relied mainly on acquiring a large-scale corpus. The text to be corrected is matched and compared against that corpus, and the most reasonable word or words are proposed according to word frequency.

[0004] The existing pinyin error correction may ignore grammatical errors when dealing with word collocation errors, ma...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F40/289, G06F40/232, G06F40/216
Inventor: 何朋罗欢权圣
Owner: MASHANG CONSUMER FINANCE CO LTD