Text error correction method and device, equipment and storage medium

A text error correction technology, applied to instruments, digital data processing, computing, etc., that addresses the Bert model's overly coarse word-similarity judgment and the class overfitting it causes, so as to improve accuracy and avoid excessive error correction.

Pending Publication Date: 2021-11-16
PING AN TECH (SHENZHEN) CO LTD

AI Technical Summary

Problems solved by technology

[0003] The present invention provides a text error correction method, device, equipment and storage medium, which can solve the problem that the Bert model's word-similarity judgment is too coarse and leads to class overfitting.




Embodiment Construction

[0046] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0047] The terms "first", "second", and "third" in the present invention are used for descriptive purposes only, and should not be understood as indicating or implying relative importance or as implicitly specifying the number of the indicated technical features. Thus, features defined as "first", "second", and "third" may explicitly or implicitly include at least one of these features. In the description of the present invention, "plurality" means at least two, such as...


Abstract

The invention relates to the technical field of artificial intelligence and discloses a text error correction method, device, equipment and storage medium. The method comprises: obtaining a text sequence to be corrected; inputting the text sequence into the Bert model, identifying the wrongly written characters in it to obtain a set of wrongly written characters, and correcting that set based on a preset candidate character set to obtain a corrected target text sequence; sequentially extracting each first target character (after error correction) from the target text sequence, obtaining the corresponding second target character (before error correction), and calculating the font similarity and the character similarity between the first and second target characters; and calculating an error correction judgment factor from the font similarity and the character similarity, comparing the factor with a preset threshold, and determining the error correction result for the text sequence according to the comparison. In this way, the class overfitting caused by the Bert model's overly coarse word-similarity judgment can be addressed.
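The final filtering step in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the abstract does not say how the judgment factor is computed or which side of the threshold accepts a correction, so the weighted average in `judgment_factor`, the "accept if factor >= threshold" rule, and the toy similarity functions are all assumptions made for this example. The Bert-based detection and correction step is assumed to have already produced `corrected`.

```python
from typing import Callable

# A similarity function maps (corrected_char, original_char) to a score in [0, 1].
Similarity = Callable[[str, str], float]


def judgment_factor(font_sim: float, char_sim: float, font_weight: float = 0.5) -> float:
    """Combine the two similarities into one factor.

    ASSUMPTION: a weighted average; the source does not give the formula.
    """
    return font_weight * font_sim + (1.0 - font_weight) * char_sim


def filter_corrections(
    original: str,
    corrected: str,
    font_similarity: Similarity,
    char_similarity: Similarity,
    threshold: float = 0.5,
    font_weight: float = 0.5,
) -> str:
    """Keep a model correction only if its judgment factor reaches the threshold.

    ASSUMPTION: a correction is accepted when factor >= threshold and reverted
    otherwise; the source only says the factor is compared with a threshold.
    """
    if len(original) != len(corrected):
        raise ValueError("sequences must be aligned character by character")
    result = []
    for before, after in zip(original, corrected):
        if before == after:
            result.append(after)  # character was not changed by the model
            continue
        factor = judgment_factor(
            font_similarity(after, before),
            char_similarity(after, before),
            font_weight,
        )
        # Revert corrections whose replacement is too dissimilar to the original.
        result.append(after if factor >= threshold else before)
    return "".join(result)


# Hypothetical toy similarity: only "0" and "o" are treated as similar.
def toy_similarity(a: str, b: str) -> float:
    return 0.9 if {a, b} == {"0", "o"} else 0.1


if __name__ == "__main__":
    # "0" -> "o" is accepted; "c" -> "d" is reverted as an over-correction.
    print(filter_corrections("b0ok cat", "book dat", toy_similarity, toy_similarity))
```

Using a graded factor instead of a binary similar/not-similar decision is what allows implausible replacements to be reverted while plausible typo fixes are kept; in practice the two similarity functions and the threshold would need to be defined as the full description specifies.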

Description

Technical field

[0001] The invention relates to the technical field of natural language processing in artificial intelligence, and in particular to a text error correction method, device, equipment and storage medium.

Background technique

[0002] Text error correction is a technology in the field of natural language processing that detects whether a text contains typos and corrects them. It is generally used in the text preprocessing stage, and it is also widely used to address inaccurate speech recognition. At present, the common problems of text error correction in the industry are: ① over-correction, ② under-correction, and ③ miscorrection. The causes of these problems include glyph splitting, distance calculation, etc. In terms of language statistical models, models such as the ngram model and the Bert model are currently in wide use. Among them, the Bert model's judgment result for word similarity is 0 or 1. This judgment m...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F40/232, G06F40/284
CPC: G06F40/232, G06F40/284, Y02D10/00
Inventor: 谷坤
Owner: PING AN TECH (SHENZHEN) CO LTD