Chinese text error correction method and device for the electric power field, storage medium and computing equipment

A text error correction technology for the power field, applied in computing, electrical digital data processing, data processing applications, etc. It addresses the limited semantic representation of statistical language models and the time-consuming, labor-intensive, poorly generalizing rule-based filtering they require, and it improves the text error correction effect.

Pending Publication Date: 2022-03-01
STATE GRID JIANGSU ELECTRIC POWER CO ELECTRIC POWER RES INST +3

AI Technical Summary

Problems solved by technology

Because statistical language models have limited semantic representation ability, candidate characters must be filtered through a large number of rules.
This approach is time-consuming and labor-intensive, and it generalizes poorly.

Examples

Embodiment Construction

[0065] The present invention will be further described below. The following examples are only used to illustrate the technical solution of the present invention more clearly, but not to limit the protection scope of the present invention.

[0066] The present invention provides a Chinese text error correction method in the electric power field, which is implemented based on a pre-trained language model, including:

[0067] Input each sentence to be corrected from the power-field Chinese text into the trained power-field PLOME pre-trained language model, which predicts, for each character in the sentence, the occurrence probability of every character in the predefined vocabulary;

[0068] Filter the occurrence probabilities for each character to obtain the semantic candidate set of each character in the sentence;

[0069] Input the same sentence into the pinyin confusion dictionary, the glyph confusion dictionary and the power-field custom confusion dictionary respectively, and obtain the pinyin co...
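Steps [0067] to [0069] can be illustrated with a minimal Python sketch. It is not the patented implementation. The model output is replaced by a hand-written toy probability table, the function names `semantic_candidates` and `confusion_sets` are hypothetical, and the filtering rule (keep the top `top_k` characters whose probability is at least `min_prob`) is an assumption, since the text above does not specify how the probabilities are filtered.

```python
from typing import Dict, List, Set


def semantic_candidates(
    char_probs: List[Dict[str, float]],
    top_k: int = 5,
    min_prob: float = 0.01,
) -> List[List[str]]:
    """For each position, keep up to top_k characters with probability >= min_prob.

    char_probs[i] maps vocabulary characters to the probability the model
    assigns them at position i. Both thresholds are assumed, not from the source.
    """
    result = []
    for probs in char_probs:
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
        result.append([ch for ch, p in ranked[:top_k] if p >= min_prob])
    return result


def confusion_sets(sentence: str, dictionary: Dict[str, Set[str]]) -> List[Set[str]]:
    """Look up every character of the sentence in one confusion dictionary.

    Characters missing from the dictionary get an empty confusion set.
    """
    return [set(dictionary.get(ch, set())) for ch in sentence]


# Toy data (made up for illustration): "变呀器" is a misspelling of "变压器" (transformer).
sentence = "变呀器"
toy_probs = [
    {"变": 0.97, "便": 0.02, "边": 0.01},
    {"压": 0.90, "呀": 0.06, "鸭": 0.04},
    {"器": 0.99, "气": 0.005},
]
pinyin_dict = {"呀": {"压", "鸭"}}   # similar pronunciation
glyph_dict = {"便": {"使", "更"}}    # similar shape
custom_dict = {"呀": {"压"}}         # power-field specific entries

candidates = semantic_candidates(toy_probs, top_k=2)
pinyin_sets = confusion_sets(sentence, pinyin_dict)
glyph_sets = confusion_sets(sentence, glyph_dict)
custom_sets = confusion_sets(sentence, custom_dict)
print(candidates)
```

Running the same lookup function against each of the three dictionaries keeps the sources of confusion separate, so a later step can weigh or combine them as needed.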

Abstract

The invention discloses a Chinese text error correction method and device for the electric power field, a storage medium and computing equipment. The method comprises: inputting a sentence to be corrected from a power-field Chinese text into a trained power-field pre-trained language model to obtain a predicted character sequence for each character in the sentence; screening the predicted character sequence of each character to obtain a semantic candidate set for each character in the sentence; inputting the same sentence into a pinyin confusion dictionary, a glyph confusion dictionary and a power-field user-defined confusion dictionary to obtain a pinyin confusion set, a glyph confusion set and a user-defined confusion set for each character in the sentence; and correcting characters in the sentence based on the semantic candidate set, the pinyin confusion set, the glyph confusion set and the user-defined confusion set. By replacing the statistical language model with a pre-trained language model, the method builds a text error correction scheme for the power industry and can effectively improve the text error correction effect.
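The final step of the abstract, correcting characters from the four sets, could look like the following Python sketch. The decision rule is an assumption, because the abstract does not state how the sets are combined: a character is kept if it appears among its own semantic candidates; otherwise it is replaced by the highest-ranked semantic candidate that also appears in any of its confusion sets. The function name `correct_sentence` and the toy inputs are hypothetical.

```python
from typing import List, Set


def correct_sentence(
    sentence: str,
    semantic: List[List[str]],
    pinyin: List[Set[str]],
    glyph: List[Set[str]],
    custom: List[Set[str]],
) -> str:
    """Correct a sentence character by character (assumed rule, see lead-in).

    semantic[i] is ranked from most to least probable; the other three
    arguments hold the confusion sets of the character at position i.
    """
    corrected = []
    for i, ch in enumerate(sentence):
        if ch in semantic[i]:
            # The model considers the original character plausible: keep it.
            corrected.append(ch)
            continue
        allowed = pinyin[i] | glyph[i] | custom[i]
        # First semantic candidate that is also a known confusion of ch,
        # falling back to the original character when there is none.
        corrected.append(next((c for c in semantic[i] if c in allowed), ch))
    return "".join(corrected)


# Toy inputs (made up): "变呀器" should become "变压器" (transformer).
fixed = correct_sentence(
    "变呀器",
    semantic=[["变"], ["压", "鸭"], ["器"]],
    pinyin=[set(), {"压", "鸭"}, set()],
    glyph=[set(), set(), set()],
    custom=[set(), {"压"}, set()],
)
print(fixed)
```

Requiring a replacement to appear in a confusion set is one way to stop the language model from rewriting characters that are merely unlikely rather than misspelled; other combination rules are equally compatible with the abstract.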

Description

Technical field

[0001] The invention discloses a Chinese text error correction method, device, storage medium and computing equipment for the electric power field, and belongs to the technical field of language processing in the electric power field.

Background technique

[0002] Chinese text error correction detects and corrects errors in Chinese text. It is a core underlying technology in natural language processing, is widely used in business scenarios such as intelligent dialogue, search engines and assisted writing, and has attracted wide attention from industry.

[0003] Chinese spelling errors are mainly divided into pinyin errors and glyph errors. As the informatization of the power industry matures, the volume of text data keeps growing. Constructing a text error correction model suited to the electric power field can effectively improve the effect of inte...

Application Information

IPC(8): G06F40/232, G06F40/30, G06Q50/06
CPC: G06F40/232, G06F40/30, G06Q50/06
Inventor: 刘子全, 杨景刚, 胡成博, 王真, 朱雪琼, 高山, 马径坦, 刘咏飞, 赵科, 路永玲
Owner STATE GRID JIANGSU ELECTRIC POWER CO ELECTRIC POWER RES INST