Wrong word correction method, device, computer device and storage medium

A pinyin-based wrong-word correction technology, applied to a wrong-word correction method, a device, a computer device and a computer storage medium. It addresses the problems that proprietary words are recognized as common words, that such errors are difficult to find, and that there is no effective way to improve the correction of recognition results.

Active Publication Date: 2022-02-15
PING AN TECH (SHENZHEN) CO LTD
6 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

Companies that develop products with speech recognition functions commonly use the speech recognition module of a general-purpose system. Because such a module is not tailored to their specific application scenarios, it easily recognizes some proprietary words as common words.
For example, "Who needs to be insured" is recognized as "Who needs to be Taobao". Because the result contains no obvious error, existing wrong-word correction systems have difficulty finding such errors.
At present, there is no effective solution for improving the correction of speech recognition results in practical application scenarios.


Examples

Embodiment 1

[0063] Figure 1 is a flow chart of the wrong-word correction method provided by Embodiment 1 of the present invention. The method is applied to a computer device.

[0064] The wrong-word correction method of the present invention corrects sentences obtained by speech recognition. It addresses the problem that a general-purpose speech recognition system cannot accurately predict the proprietary words of a specific field, strengthens the error correction system's ability to find wrong words when proprietary words have been replaced with common words, and improves user experience.

[0065] As shown in Figure 1, the wrong-word correction method comprises:

[0066] Step 101: acquire a general natural language data set that includes multiple sentences.

[0067] The general-purpose natural language dataset is a Chinese text containing...

Embodiment 2

[0111] Figure 2 is a structural diagram of the wrong-word correction device provided by Embodiment 2 of the present invention. The wrong-word correction device 20 is applied to a computer device. As shown in Figure 2, the device 20 may include a first acquisition module 201, a conversion module 202, a generation module 203, a pre-training module 204, a second acquisition module 205, a fine-tuning module 206, and an error correction module 207.

[0112] The first acquisition module 201 is configured to acquire a general natural language data set that includes a plurality of sentences.

[0113] The general-purpose natural language dataset is a Chinese text containing everyday expressions.

[0114] The general natural language data set can be collected from data sources such as books, news, web pages (such as Baidu Encyclopedia, Wikipedia, etc.). For example, character recognition can be perform...

Embodiment 3

[0158] This embodiment provides a computer storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps of the above wrong-word correction method embodiment are implemented, for example Steps 101-107 shown in Figure 1:

[0159] Step 101: obtain a general natural language data set that includes a plurality of sentences;

[0160] Step 102: convert each sentence contained in the general natural language data set into a pinyin sequence to obtain the pinyin-sentence pairs of the general natural language data set;

[0161] Step 103: select a plurality of pinyin-sentence pairs from the pinyin-sentence pairs of the general natural language data set, replace part of the pinyin of each selected pair with similar pinyin to obtain replaced pinyin-sentence pairs, and form a first sample set from the unselected pinyin-sentence pairs and the replaced pinyin-sentence...
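Steps 102 and 103 can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and does not come from the patent text: the toy character-to-pinyin table `CHAR_TO_PINYIN`, the similar-pinyin table `SIMILAR_PINYIN`, the helper names, the selection ratio, the choice to replace exactly one syllable per selected pair, and the Chinese sample sentences (assumed renderings of phrases such as "Who needs to be insured").

```python
import random

# Hypothetical toy character-to-pinyin table (assumption). A real system
# would need a full pronunciation dictionary.
CHAR_TO_PINYIN = {
    "谁": "shei", "需": "xu", "要": "yao", "投": "tou", "保": "bao",
    "我": "wo", "买": "mai", "书": "shu",
}

# Hypothetical table of similar-sounding syllables (assumption).
SIMILAR_PINYIN = {
    "tou": ["tao"],
    "shu": ["su"],
    "shei": ["sei"],
}


def to_pinyin(sentence):
    """Step 102: convert a sentence into a pinyin sequence (one syllable per character)."""
    return [CHAR_TO_PINYIN[ch] for ch in sentence]


def make_pairs(sentences):
    """Pair every sentence with its pinyin sequence."""
    return [(to_pinyin(s), s) for s in sentences]


def replace_similar(pinyin, rng):
    """Replace one syllable that has a similar-sounding alternative."""
    candidates = [i for i, syl in enumerate(pinyin) if syl in SIMILAR_PINYIN]
    if not candidates:
        return list(pinyin)
    i = rng.choice(candidates)
    noisy = list(pinyin)
    noisy[i] = rng.choice(SIMILAR_PINYIN[noisy[i]])
    return noisy


def build_first_sample_set(pairs, ratio, rng):
    """Step 103: replace pinyin in a selected fraction of pairs, keep the rest unchanged."""
    n_selected = int(len(pairs) * ratio)
    selected = set(rng.sample(range(len(pairs)), n_selected))
    samples = []
    for idx, (pinyin, sentence) in enumerate(pairs):
        if idx in selected:
            # The target sentence stays intact; only the pinyin input is perturbed.
            samples.append((replace_similar(pinyin, rng), sentence))
        else:
            samples.append((pinyin, sentence))
    return samples


rng = random.Random(0)
pairs = make_pairs(["谁需要投保", "我要买书"])
first_sample_set = build_first_sample_set(pairs, ratio=0.5, rng=rng)
```

In this sketch the target sentence of a selected pair is left unchanged while its pinyin is perturbed, so a model trained on the resulting set sees similar-sounding input paired with the intended sentence.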

Abstract

The invention provides a wrong-word correction method, a device, a computer device and a storage medium. The method comprises: obtaining a general natural language data set; converting each sentence in the general natural language data set into a pinyin sequence to obtain the pinyin-sentence pairs of the general natural language data set; performing pinyin replacement on some of those pinyin-sentence pairs to obtain a first sample set; pre-training a neural network model with the first sample set to obtain a pre-trained neural network model; using pinyin-sentence pairs as a second sample set; fine-tuning the pre-trained neural network model with the second sample set to obtain a fine-tuned neural network model; and inputting the pinyin sequence of a sentence to be corrected into the fine-tuned neural network model for error correction to obtain the corrected sentence. The invention can correct errors in which a proprietary word is recognized as a common word in speech recognition.
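The order of operations in the abstract (pre-train on the first sample set, fine-tune on the second, then correct from pinyin) can be sketched as follows. This is only an illustration of the data flow, not the patented model: a simple lookup table, `LookupModel`, stands in for the neural network, and later training simply overrides earlier training. The character table, the Chinese sample sentences, and the assumption that the second sample set holds pairs from the target domain are all hypothetical.

```python
# Hypothetical toy character-to-pinyin table (assumption, not from the source).
CHAR_TO_PINYIN = {
    "谁": "shei", "需": "xu", "要": "yao",
    "投": "tou", "保": "bao", "淘": "tao", "宝": "bao",
}


def to_pinyin(sentence):
    """Convert a sentence into a pinyin sequence, one syllable per character."""
    return tuple(CHAR_TO_PINYIN[ch] for ch in sentence)


class LookupModel:
    """Stand-in for the neural network (assumption): it remembers the most
    recent sentence seen for each pinyin sequence, so later training
    overrides earlier training."""

    def __init__(self):
        self.table = {}

    def fit(self, samples):
        for pinyin, sentence in samples:
            self.table[tuple(pinyin)] = sentence
        return self

    def correct(self, pinyin):
        # Returns None for pinyin sequences never seen in training.
        return self.table.get(tuple(pinyin))


# First sample set: pairs built from general-purpose text.
first_sample_set = [
    (to_pinyin("谁需要淘宝"), "谁需要淘宝"),
]

# Second sample set: assumed here to be pairs from the target domain, where
# the similar-sounding pinyin maps to the proprietary word.
second_sample_set = [
    (to_pinyin("谁需要淘宝"), "谁需要投保"),
    (to_pinyin("谁需要投保"), "谁需要投保"),
]

recognized = "谁需要淘宝"  # sentence to be corrected, as output by speech recognition

model = LookupModel().fit(first_sample_set)    # pre-training
before = model.correct(to_pinyin(recognized))  # general model keeps the common word

model.fit(second_sample_set)                   # fine-tuning
after = model.correct(to_pinyin(recognized))   # domain word is restored
```

The point of the sketch is the two-stage flow: the same model object is trained first on general data and then on the second sample set, and the sentence to be corrected enters the model as a pinyin sequence rather than as text.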

Description

Technical field

[0001] The invention relates to the technical field of speech recognition, and in particular to a wrong-word correction method and device, a computer device and a computer storage medium.

Background

[0002] As speech recognition application scenarios expand rapidly, speech recognition technology is maturing and market demand for high-accuracy speech recognition keeps growing. Companies that develop products with speech recognition functions commonly use the speech recognition module of a general-purpose system, which is not tailored to their specific application scenarios, so some proprietary words are easily recognized as common words. For example, "who needs to be insured" is recognized as "who needs to be Taobao". Because the result contains no obvious mistake, existing wrong-word correction systems have difficulty finding such errors.

[0003] At present, there is no effective solution ...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F16/33, G06F40/279, G06F40/232, G06N3/04, G10L25/30, G10L15/08
CPC: G06F16/3343, G06F16/3344, G10L25/30, G10L15/08, G10L2015/088, G06F40/232, G06F40/279, G06N3/045
Inventor: 解笑, 徐国强, 邱寒
Owner: PING AN TECH (SHENZHEN) CO LTD