Wrong word correction method, device, computer device and storage medium
A wrong word correction method based on pinyin, applied to devices, computer devices, and computer storage media. It addresses problems such as proprietary words being recognized as common words, speech recognition lacking effective recognition and correction of such words, and the resulting wrong words being difficult to find.
Examples
Embodiment 1
[0063] Figure 1 is a flow chart of the wrong word correction method provided by Embodiment 1 of the present invention. The method is applied to a computer device.
[0064] The wrong word correction method of the present invention corrects sentences obtained by speech recognition. It addresses the problem that a speech recognition system, being general-purpose, cannot accurately predict proprietary words in a specific field. It also strengthens the error correction system's ability to find wrong words when proprietary words have been replaced with common words, which improves the user experience.
[0065] As shown in Figure 1, the wrong word correction method comprises:
[0066] Step 101, acquiring a general natural language data set, the general natural language data set including multiple sentences.
[0067] The general-purpose natural language dataset is a Chinese text containing...
Embodiment 2
[0111] Figure 2 is a structural diagram of the wrong word correction device provided by Embodiment 2 of the present invention. The wrong word correction device 20 is applied to a computer device. As shown in Figure 2, the device 20 may include a first acquisition module 201, a conversion module 202, a generation module 203, a pre-training module 204, a second acquisition module 205, a fine-tuning module 206, and an error correction module 207.
[0112] The first acquisition module 201 is configured to acquire a general natural language data set, the general natural language data set including a plurality of sentences.
[0113] The general-purpose natural language dataset is a Chinese text containing everyday expressions.
[0114] The general natural language data set can be collected from data sources such as books, news, and web pages (for example Baidu Encyclopedia and Wikipedia). For example, character recognition can be perform...
Embodiment 3
[0158] This embodiment provides a computer storage medium on which a computer program is stored. When the computer program is executed by a processor, the steps in the above embodiment of the wrong word correction method are implemented, for example Steps 101-107 shown in Figure 1:
[0159] Step 101, obtaining a general natural language data set, the general natural language data set includes a plurality of sentences;
[0160] Step 102, converting each sentence contained in the general natural language dataset into a pinyin sequence to obtain a pinyin-sentence pair of the general natural language dataset;
[0161] Step 103, selecting a plurality of pinyin-sentence pairs from the pinyin-sentence pairs of the general natural language data set, replacing part of the pinyin of each selected pinyin-sentence pair with similar pinyin to obtain the replaced pinyin-sentence pairs, and forming a first sample set from the unselected pinyin-sentence pairs and the replaced pinyin-sentence...
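Steps 102 and 103 can be illustrated with a minimal Python sketch. It is not the patent's implementation. The lookup table `CHAR_TO_PINYIN`, the confusion table `SIMILAR_PINYIN`, the function names, and the parameters `select_ratio` and `replace_prob` are all hypothetical and chosen only for illustration; a real system would need a complete pinyin dictionary and a principled definition of "similar pinyin", neither of which is specified in the text above.

```python
import random

# Hypothetical toy table mapping characters to toneless pinyin.
# A real system would need a full pinyin dictionary.
CHAR_TO_PINYIN = {
    "我": "wo", "爱": "ai", "中": "zhong", "国": "guo",
    "你": "ni", "好": "hao",
}

# Hypothetical confusion table of similar-sounding syllables
# (an assumption for illustration, not taken from the source).
SIMILAR_PINYIN = {
    "zhong": ["zong", "chong"],
    "guo": ["gou", "kuo"],
    "ni": ["li"],
    "hao": ["hou"],
}


def to_pinyin_pair(sentence):
    """Step 102 sketch: build a (pinyin sequence, sentence) pair.

    Raises KeyError for characters missing from the toy table.
    """
    return [CHAR_TO_PINYIN[ch] for ch in sentence], sentence


def replace_with_similar(pair, rng, replace_prob=0.5):
    """Step 103 sketch: swap some syllables for similar ones.

    The target sentence is left unchanged, so the pair maps a noisy
    pinyin sequence to the correct sentence.
    """
    pinyin, sentence = pair
    noisy = []
    for syllable in pinyin:
        candidates = SIMILAR_PINYIN.get(syllable)
        if candidates and rng.random() < replace_prob:
            noisy.append(rng.choice(candidates))
        else:
            noisy.append(syllable)
    return noisy, sentence


def build_first_sample_set(sentences, select_ratio=0.5, seed=0):
    """Combine unselected pairs with replaced pairs into one sample set."""
    rng = random.Random(seed)
    pairs = [to_pinyin_pair(s) for s in sentences]
    n_selected = int(len(pairs) * select_ratio)
    selected = set(rng.sample(range(len(pairs)), n_selected))
    return [
        replace_with_similar(p, rng) if i in selected else p
        for i, p in enumerate(pairs)
    ]


if __name__ == "__main__":
    for pinyin, sentence in build_first_sample_set(["我爱中国", "你好"]):
        print(" ".join(pinyin), "->", sentence)
```

Keeping the original sentence as the target while perturbing only the pinyin side means a model trained on these pairs sees both clean and noisy inputs for the same correct output, which matches the stated goal of forming the sample set from both unselected and replaced pairs.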