Text error correction method and system, computer equipment and readable storage medium

A text error correction technology, applied in fields such as computing, computing models, and machine learning. It addresses problems such as the high labor cost of building typo dictionaries, unreasonable typo output, and weak semantic information, thereby improving error correction accuracy as well as model effect and performance.

Pending Publication Date: 2021-01-29
BEIJING MININGLAMP SOFTWARE SYST CO LTD
Cites: 0; Cited by: 16

AI Technical Summary

Problems solved by technology

[0004] Among the above-mentioned approaches, constructing a typo dictionary carries a relatively high labor cost and is only applicable to some vertical fields with a limited set of typos; the edit distance method does not generalize. In a language model, character-granularity input carries relatively weak semantic information, so its misjudgment rate is higher than that of word-granularity error correction, while word-granularity correction depends more heavily on the accuracy of the word segmentation model. To reduce the misjudgment rate, a CRF layer is often added to the model's output layer for proofreading; by learning transition probabilities and the globally optimal path, it avoids unreasonable typo output.
The plain BERT approach is too coarse and easily leads to a high misjudgment rate. BERT's mask positions are selected at random, so it is poor at detecting where the errors in a sentence are, and BERT-based correction does not take constraints into account, which results in low accuracy.
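The contrast with random hard masking can be illustrated with the soft-masking idea behind the SoftMasked BERT model named in the abstract. The sketch below follows the publicly described Soft-Masked BERT formulation, in which each token embedding is blended with the [MASK] embedding in proportion to a detector's estimated error probability. It is not confirmed to be this application's exact computation; the function name `soft_mask` and the toy vectors are assumptions made for illustration.

```python
def soft_mask(embeddings, mask_embedding, error_probs):
    """Blend each token embedding with the [MASK] embedding.

    e'_i = p_i * e_mask + (1 - p_i) * e_i, where p_i is the detector's
    estimated probability that token i is an error.
    """
    blended = []
    for vec, p in zip(embeddings, error_probs):
        blended.append(
            [p * m + (1.0 - p) * x for x, m in zip(vec, mask_embedding)]
        )
    return blended


# Toy 2-dimensional embeddings (illustrative values only).
tokens = [[1.0, 0.0], [0.0, 1.0]]
mask = [0.5, 0.5]
probs = [0.0, 1.0]  # detector: first token looks fine, second looks wrong
result = soft_mask(tokens, mask, probs)
```

A token the detector trusts (p = 0) passes through unchanged, while a token it flags with certainty (p = 1) is fully replaced by the mask embedding, so masking is driven by detection rather than by random position selection.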



Examples

Embodiment Construction

[0045] In order to make the purpose, technical solutions and advantages of the present application clearer, the present application will be described and illustrated below in conjunction with the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application, and are not intended to limit the present application. Based on the embodiments provided in this application, all other embodiments obtained by persons of ordinary skill in the art without creative efforts fall within the protection scope of this application.

[0046] Obviously, the accompanying drawings in the following description are only some examples or embodiments of the present application, and those skilled in the art can also apply the present application to other similar scenarios. In addition, it can also be understood that although such development efforts may be complex and lengthy, for those of ordinary skill in th...

Abstract

The invention relates to a text error correction method and system, computer equipment, and a computer-readable storage medium. The method comprises: a data acquisition step of obtaining the text data to be corrected; a negative sample construction step of creating a confusion word table and performing corpus replacement on the text to be corrected according to that table to generate negative samples; a text error correction step of taking the text data to be corrected and the negative sample data as training data, training Chinese character features and pinyin features of the training data separately through a SoftMasked BERT pre-training model, splicing (concatenating) the Chinese character features and the pinyin features into a training result, and calculating the cross-entropy loss of the training result through a Softmax layer to obtain an error correction result; and a model optimization step of optimizing the SoftMasked BERT pre-training model through recursive prediction and word list filtering. The method effectively improves text error correction accuracy and improves the model's effect and performance.
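The negative sample construction and model optimization steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the confusion table entries, the `rate` parameter, the function names, and the `toy_corrector` stand-in for the trained model are all hypothetical, and word list filtering is assumed here to mean that a proposed character is accepted only if it appears in an allowed vocabulary.

```python
import random

# Hypothetical confusion table: each character maps to characters it is
# commonly mistaken for. Entries are illustrative only.
CONFUSION_TABLE = {
    "在": ["再"],
    "的": ["地", "得"],
}


def make_negative_sample(sentence, table, rate=0.3, rng=None):
    """Replace some characters with confusable ones to create a noisy copy."""
    rng = rng or random.Random()
    chars = []
    for ch in sentence:
        if ch in table and rng.random() < rate:
            chars.append(rng.choice(table[ch]))
        else:
            chars.append(ch)
    return "".join(chars)


def recursive_correct(sentence, correct_once, vocabulary, max_rounds=3):
    """Re-apply a corrector until the output stops changing.

    Assumes `correct_once` returns a string of the same length as its input.
    A proposed character is kept only if it is in `vocabulary`; otherwise
    the original character is retained.
    """
    for _ in range(max_rounds):
        proposal = correct_once(sentence)
        filtered = "".join(
            new if new in vocabulary else old
            for old, new in zip(sentence, proposal)
        )
        if filtered == sentence:
            break
        sentence = filtered
    return sentence


def toy_corrector(sentence):
    """Stand-in for the trained model: fixes at most one known error per call."""
    fixes = {"再": "在", "地": "的"}
    for i, ch in enumerate(sentence):
        if ch in fixes:
            return sentence[:i] + fixes[ch] + sentence[i + 1:]
    return sentence


vocab = set("我在家的书")
noisy = "我再家地书"
corrected = recursive_correct(noisy, toy_corrector, vocab)
```

Because the stand-in corrector repairs only one character per pass, the second error is fixed only on the second round, which is the situation recursive prediction is meant to handle. In a real pipeline `correct_once` would wrap the trained model's prediction for a sentence.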

Description

Technical field

[0001] The present application relates to the field of natural language processing, in particular to a text error correction method, system, computer equipment and computer-readable storage medium.

Background

[0002] Chinese error correction technology is an important technology for automatically checking and correcting Chinese sentences. Its purpose is to improve the correctness of the language and reduce the cost of manual verification. Chinese error correction technology mainly corrects errors based on the similarity of character shapes. It is a technology in the field of natural language processing that detects whether a text contains typos and corrects them. Inaccurate input, such as typos in questions in some intelligent question-and-answer scenarios, affects query understanding and dialogue quality. In the general field, solutions to the problem of Chinese text error correction have been sought since the beginning of the ...

Application Information

Patent Type & Authority: Applications (China)
IPC (8): G06F40/232, G06F40/216, G06F40/284, G06N20/00
CPC: G06F40/232, G06F40/216, G06F40/284, G06N20/00
Inventor: 陈倩倩, 景艳山, 郑悦
Owner: BEIJING MININGLAMP SOFTWARE SYST CO LTD