Text normalization method, device and equipment and storage medium

A text normalization technology, applied in fields such as word processing and semantic analysis, that can solve problems such as poor text readability, difficulty in understanding the speaker, and difficulty for target users in understanding the text.

Pending Publication Date: 2020-02-07
IFLYTEK CO LTD

AI Technical Summary

Problems solved by technology

[0002] In some application scenarios, text is obtained and then needs to be provided to target users for reading. However, for various reasons, the obtained text may be poorly readable or unclear in meaning, which makes it difficult for target users to read.
[0003] Take speech recognition as an example: voice input is the most natural and convenient mode of human-computer interaction. During voice input, for various reasons (for example, other voices intruding near the speaker ...


Examples


Detailed Description of the Embodiments

[0087] The following will clearly and completely describe the technical solutions in the embodiments of the present invention with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some, not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the protection scope of the present invention.

[0088] To achieve text normalization, the inventor conducted research. The initial idea was as follows:

[0089] Transform text normalization into a binary classification problem, that is, decide whether to "delete" or "retain" each word in the text to be normalized. The basic process may include: first, perform word segmentation on the text to be normalized to obtain a word sequence; then, input the word sequence into the pre-established ...
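The delete-or-retain idea in [0089] can be illustrated with a minimal Python sketch. The excerpt is cut off before it names the pre-established model, so everything here is an assumption made for illustration only: `segment` is a whitespace splitter standing in for real word segmentation, `FILLERS` is an invented filler vocabulary, and `toy_classifier` is a hand-written rule standing in for whatever trained classifier the patent uses.

```python
from typing import Callable, List

# Hypothetical filler vocabulary; the source does not list one.
FILLERS = {"um", "uh", "er"}


def segment(text: str) -> List[str]:
    """Stand-in word segmentation: split on whitespace (an assumption)."""
    return text.split()


def toy_classifier(words: List[str]) -> List[str]:
    """Stand-in for the pre-established model: one label per word.

    A word is labelled "delete" if it is a filler or if it repeats the
    word immediately before it; otherwise it is labelled "retain".
    """
    labels = []
    for i, word in enumerate(words):
        is_filler = word.lower() in FILLERS
        is_repeat = i > 0 and word.lower() == words[i - 1].lower()
        labels.append("delete" if is_filler or is_repeat else "retain")
    return labels


def normalize(
    text: str,
    classify: Callable[[List[str]], List[str]] = toy_classifier,
) -> str:
    """Segment the text, label every word, and keep only retained words."""
    words = segment(text)
    labels = classify(words)
    return " ".join(w for w, lab in zip(words, labels) if lab == "retain")


print(normalize("um I I want to to go"))  # prints "I want to go"
```

Passing the classifier in as a parameter keeps the pipeline shape (segment, classify, filter) separate from the labelling rule, so a trained model could be substituted for the toy rule without changing `normalize`.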



Abstract

The invention provides a text normalization method, device, equipment, and storage medium. The method comprises: obtaining a text to be normalized; extracting text normalization features from the text to be normalized, the features comprising at least semantic features that represent the semantics of the text and generalization features that represent repeated parts of the text; and determining the normalized text corresponding to the text to be normalized by using the text normalization features and a pre-established text normalization model. With the method provided by the invention, the text to be normalized can be converted into text with clear meaning and relatively strong readability and logical coherence.
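As a rough illustration of the feature-extraction step in the abstract, the Python sketch below builds one feature record per word. The abstract does not say how either feature type is computed, so both parts are assumptions: `repetition_flags` is one possible reading of "generalization features" (it flags the first copy of any immediately repeated span of up to `max_n` words), and `semantic_id` is only a placeholder vocabulary index, not a real semantic representation.

```python
from typing import Dict, List, Union


def repetition_flags(words: List[str], max_n: int = 2) -> List[int]:
    """Flag words that belong to a span repeated immediately afterwards.

    For each span length n (1..max_n), if words[i:i+n] equals
    words[i+n:i+2n], the first copy (positions i..i+n-1) is flagged 1.
    """
    lowered = [w.lower() for w in words]
    flags = [0] * len(lowered)
    for n in range(1, max_n + 1):
        for i in range(len(lowered) - 2 * n + 1):
            if lowered[i:i + n] == lowered[i + n:i + 2 * n]:
                for j in range(i, i + n):
                    flags[j] = 1
    return flags


def extract_features(words: List[str]) -> List[Dict[str, Union[str, int]]]:
    """Combine a placeholder semantic feature with the repetition flag."""
    vocab: Dict[str, int] = {}
    # Placeholder only: a per-text vocabulary index, not true semantics.
    semantic_ids = [vocab.setdefault(w.lower(), len(vocab)) for w in words]
    flags = repetition_flags(words)
    return [
        {"word": w, "semantic_id": s, "repeated": r}
        for w, s, r in zip(words, semantic_ids, flags)
    ]


features = extract_features(["I", "want", "I", "want", "to", "go"])
print([f["repeated"] for f in features])  # prints [1, 1, 0, 0, 0, 0]
```

In this sketch the feature records would be the input to the text normalization model; the model itself is not shown because the source gives no detail about it.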

Description

Technical field

[0001] The present application relates to the technical field of natural language processing, and in particular to a text normalization method, device, equipment, and storage medium.

Background

[0002] In some application scenarios, text is obtained and then needs to be provided to target users for reading. However, for various reasons, the obtained text may be poorly readable or unclear in meaning, which makes it difficult for target users to read.

[0003] Take speech recognition as an example: voice input is the most natural and convenient mode of human-computer interaction. During voice input, for various reasons (for example, other voices intruding near the speaker, or the speaker's nervousness or unclear thinking), the speaker may utter meaningless modal particles and repeated words, or may use online vocabulary, personalized vocabulary, and other expressions that ordinary people cannot understand due t...


Application Information

IPC(8): G06F40/10, G06F40/30
Inventor: 张强
Owner: IFLYTEK CO LTD