Text normalization method, device and apparatus and readable storage medium

A text regularization technology, applied in the fields of special data processing applications, instruments, and electrical digital data processing. It addresses the problem that numbers written out in Chinese characters are lengthy and cumbersome, which hampers reading and makes it difficult to quickly grasp the key information of a text. The method achieves a good regularization effect and is easy to implement.

Active Publication Date: 2019-03-08
IFLYTEK CO LTD
4 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0002] Speech recognition technology converts audio into text. In many cases the recognized text contains numbers, and these numbers are usually expressed in Chinese characters, such as one, two, three, four, five, and so on. Numbers written out in Chinese characters are lengthy and cumbersome, which greatly hampers reading and makes it difficult to quickly grasp the key information of the text. For example, the recognized text includes "Your mobile phone number is 13956143260; as of 18:32 on June 20, 2018, the unbilled phone bill was 204.14 yuan." To help users read the text and quickly grasp its key information, there is an urgent need for a text regularization scheme that reasonably and efficiently converts number-related Chinese characters in the text into Arabic numerals or special symbols, so as to obtain text data that is easy to read and understand.
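To make the conversion concrete, the following is a minimal sketch of turning a numeral written in Chinese characters into an Arabic number. It is only an illustration of the problem, not the method of the application; the function name `chinese_numeral_to_int` and the lookup tables are hypothetical, and the sketch assumes well-formed numerals below one trillion.

```python
# Hypothetical lookup tables for this illustration only.
DIGITS = {"零": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
BIG_UNITS = {"万": 10_000, "亿": 100_000_000}


def chinese_numeral_to_int(text: str) -> int:
    """Convert a well-formed numeral such as 二百零四 to an int (sketch)."""
    total = 0    # sum of completed 万 / 亿 sections
    section = 0  # value accumulated below the next big unit
    digit = 0    # most recent digit, still waiting for its unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare 十 (as in 十八) stands for 1 x 10.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in BIG_UNITS:
            total += (section + digit) * BIG_UNITS[ch]
            section = 0
            digit = 0
        else:
            raise ValueError(f"not a numeral character: {ch!r}")
    return total + section + digit


if __name__ == "__main__":
    for numeral in ("二百零四", "十八", "二千零一十八"):
        print(numeral, chinese_numeral_to_int(numeral))
```

This value-based reading suits amounts such as the phone bill in the example. A phone number, by contrast, is read digit by digit, which is one reason the same characters need different handling in different contexts.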


Examples

Detailed Description of the Embodiments

[0049] The technical solutions in the embodiments of the application are described clearly and completely below with reference to the drawings in the embodiments of the application. Obviously, the described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments in this application, without creative effort, fall within the scope of protection of this application.

[0050] In order to reasonably convert number-related Chinese characters in a text into Arabic numerals or special symbols, the inventors of this application conducted in-depth research:

[0051] The initial idea was to use offline-defined grammar rules to match the text that needs to be regularized in different scenarios, such as numerical values, phone numbers, dates, times, mathematics, race scores, license plate numbers, file numbers, addresses, idioms, and ...
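The rule-matching idea in [0051] can be pictured with a small sketch. The two rules below (an 11-character phone number read digit by digit, and a time of the form "X点Y分") are hypothetical examples written for this illustration; the actual grammar rules of the application are not given in the text above, and all names here (`RULES`, `apply_rules`, `read_digits`, `small_number`) are assumptions.

```python
import re

# Characters used when reading digits aloud; 幺 is a colloquial "one".
D = "零一二三四五六七八九幺"
DIGIT_MAP = {ch: str(i) for i, ch in enumerate("零一二三四五六七八九")}
DIGIT_MAP["幺"] = "1"

# A numeral below 100: 十八, 三十二, 六, 零五 (enough for hours and minutes).
NUM = r"(?:[一二三四五]?十[一二三四五六七八九]?|零?[零一二三四五六七八九])"


def read_digits(chars: str) -> str:
    """Digit-by-digit reading, e.g. 一三九 becomes 139."""
    return "".join(DIGIT_MAP[ch] for ch in chars)


def small_number(chars: str) -> int:
    """Value of a numeral below 100, e.g. 三十二 becomes 32."""
    if "十" not in chars:
        return int(read_digits(chars))
    tens, _, ones = chars.partition("十")
    return (int(read_digits(tens)) if tens else 1) * 10 + (
        int(read_digits(ones)) if ones else 0
    )


# Each rule: (scenario name, pattern, converter). Hypothetical examples only.
RULES = [
    ("phone_number",
     re.compile(rf"(?<![{D}])[{D}]{{11}}(?![{D}])"),
     lambda m: read_digits(m.group())),
    ("time",
     re.compile(rf"({NUM})点({NUM})分"),
     lambda m: f"{small_number(m.group(1))}:{small_number(m.group(2)):02d}"),
]


def apply_rules(text: str) -> str:
    """Apply every scenario rule in turn to the text."""
    for _scenario, pattern, convert in RULES:
        text = pattern.sub(convert, text)
    return text


if __name__ == "__main__":
    print(apply_rules("您的手机号码是一三九五六一四三二六零"))
    print(apply_rules("截至十八点三十二分"))
```

Each scenario in the list above would need its own pattern and converter, so the rule set grows with every new scenario that has to be covered.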

Abstract

The application provides a text normalization method, device and apparatus, and a readable storage medium. The method includes: obtaining the text to be normalized; processing the content of the text to be normalized into a plurality of text units to obtain a preprocessed text, wherein each text unit in the preprocessed text is a character or a word; and, based on the normalization category information corresponding to each text unit in the preprocessed text, normalizing the text units that need normalization in the text to be normalized, to obtain the normalized text. The text normalization method provided by the present application can convert number-related Chinese characters into Arabic numerals or special symbols, thereby obtaining text data that is convenient for users to read and understand. The method is easy to implement and achieves a good normalization effect.
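The three steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: each text unit is taken to be a single character, the per-unit category labels are supplied by hand (the abstract does not say how they are obtained), and the category names `O`, `DIGITS` and `SYMBOL`, like every function name here, are hypothetical.

```python
from itertools import groupby

DIGIT_MAP = {ch: str(i) for i, ch in enumerate("零一二三四五六七八九")}
SYMBOL_MAP = {"点": "."}  # example of a "special symbol" conversion


def to_digit_string(span: str) -> str:
    """Digit-by-digit conversion, e.g. 一三九 becomes 139."""
    return "".join(DIGIT_MAP[ch] for ch in span)


def to_symbols(span: str) -> str:
    """Replace each character by its symbol, leaving unknown ones as-is."""
    return "".join(SYMBOL_MAP.get(ch, ch) for ch in span)


# Hypothetical categories; "O" (or any unknown label) means "leave unchanged".
CONVERTERS = {"DIGITS": to_digit_string, "SYMBOL": to_symbols}


def split_into_units(text: str) -> list[str]:
    """Step 2: preprocess the text into units (here, one character each)."""
    return list(text)


def normalize(units: list[str], categories: list[str]) -> str:
    """Step 3: convert each run of units according to its category."""
    if len(units) != len(categories):
        raise ValueError("need exactly one category per text unit")
    pieces = []
    pos = 0
    for category, run in groupby(categories):
        length = len(list(run))
        span = "".join(units[pos:pos + length])
        pos += length
        convert = CONVERTERS.get(category)
        pieces.append(convert(span) if convert else span)
    return "".join(pieces)


if __name__ == "__main__":
    text = "手机号一三九五六一四三二六零"  # step 1: the text to be normalized
    units = split_into_units(text)
    labels = ["O"] * 3 + ["DIGITS"] * 11  # assumed, hand-written labels
    print(normalize(units, labels))
```

In this sketch, adjacent units with the same category are merged into one span before conversion. With the labels `DIGITS, SYMBOL, DIGITS, DIGITS`, the units of "三点一四" come out as "3.14".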

Description

Technical field

[0001] The present application relates to the technical field of speech recognition, and in particular to a text regularization method, device, equipment and readable storage medium.

Background

[0002] Speech recognition technology converts audio into text. In many cases the recognized text contains numbers, and these numbers are usually expressed in Chinese characters, such as one, two, three, four, five, and so on. Numbers written out in Chinese characters are lengthy and cumbersome, which greatly hampers reading and makes it difficult to quickly grasp the key information of the text. For example, the recognized text includes "Your mobile phone number is 13956143260; as of 18:32 on June 20, 2018, the unbilled phone bill was 204.14 yuan." To help users read the text and quickly grasp its key information, there is an urgent need for a reasonable text regularization scheme that converts numb...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G06F17/25, G06F40/189
CPC: G06F40/189, Y02D10/00
Inventor: 戚婷, 高建清, 孔常青, 王智国
Owner: IFLYTEK CO LTD