Method and system of construction of voice corpus

This speech corpus technology, applied to the field of voice corpus construction methods and systems, addresses the problems that existing voice corpora are costly to acquire and cannot combine low cost with a high recognition rate. Its effects are reduced acquisition cost, improved recognition efficiency, and shorter training times.

Active Publication Date: 2013-07-10

CENTRIN DATA SYST

Cites: 5; Cited by: 26

Problems solved by technology

[0006] The first technical problem to be solved by the present invention is the high collection cost of existing speech corpus collection methods. To that end, a method and system for constructing a speech corpus that make full use of the existing Internet are provided.

[0007] The second technical problem to be solved is that existing speech corpora built entirely from entered speech, and actual scene speech corpora built entirely from actual scenes, cannot combine low cost with a high recognition rate. To that end, a method and system for building a low-cost, high-recognition-rate speech corpus are provided.

Examples

Embodiment 1

[0046] Figure 1 shows a speech corpus construction system according to an embodiment of the present invention, which includes a speech input client, an annotation client, and a server.

[0047] The voice input client further includes a sound collection device and a network sending device. The sound collection device collects the voice entered by the user as the basic speech corpus and transmits the collected basic speech corpus to the network sending device. In a specific embodiment the sound collection device is a microphone; in other implementations it can be any device capable of collecting sound. The network sending device receives the basic speech corpus collected by the sound collection device and transmits it over the network to the server.

[0048] The actual scene speech corpus acquisition device is used to collect the speech corpus generated in the actual application scene, and recognize the collected actual scene speech corpu...

Embodiment 2

[0063] Referring to Figure 2, based on the same inventive concept, the present invention also provides a method for constructing a speech corpus, comprising the following steps:

[0064] S01: The sound collection device records the voice information to form the basic voice corpus and transmits it to the network sending device;

[0065] S02: The network sending device sends the basic voice corpus received from the voice input client to the server;

[0066] S03: The server receives the basic speech corpus sent by the network sending device and stores it in a corpus.
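Steps S01 to S03 can be pictured with a minimal in-memory sketch. This is an illustration under stated assumptions, not the patent's implementation: the names `Utterance`, `CorpusServer`, `NetworkSender`, `collect_voice`, and `submit_recording` are hypothetical, and a direct method call stands in for the network transfer.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    """One recorded utterance (hypothetical record layout)."""
    speaker_id: str
    audio: bytes


@dataclass
class CorpusServer:
    """Stand-in for the server: stores received utterances in a corpus (S03)."""
    corpus: List[Utterance] = field(default_factory=list)

    def receive(self, utterance: Utterance) -> int:
        self.corpus.append(utterance)
        return len(self.corpus) - 1  # position of the stored utterance


class NetworkSender:
    """Stand-in for the network sending device (S02).

    Assumption: a direct call replaces the real network transfer.
    """

    def __init__(self, server: CorpusServer) -> None:
        self.server = server

    def send(self, utterance: Utterance) -> int:
        return self.server.receive(utterance)


def collect_voice(speaker_id: str, audio: bytes) -> Utterance:
    """Stand-in for the sound collection device (S01), e.g. a microphone."""
    if not audio:
        raise ValueError("empty recording")
    return Utterance(speaker_id=speaker_id, audio=audio)


def submit_recording(sender: NetworkSender, speaker_id: str, audio: bytes) -> int:
    """Run S01 then S02; the server performs S03 on receipt."""
    return sender.send(collect_voice(speaker_id, audio))
```

Keeping the collection, sending, and storage roles in separate objects mirrors the device split described in the text, so any one of them could be swapped for a real microphone driver or HTTP client without touching the others.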

[0067] Referring to Figure 3, the speech corpus construction method of the present invention also comprises the following steps:

[0068] S'01: Collect actual scene speech corpus, recognize the collected actual scene speech corpus, and transmit the actual scene speech corpus and recognition results to the temporary corpus of the server.

[0069] S'02: Online labeling the actual scene speech corpus stored in the temporary co...
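The two visible steps, S'01 and S'02, suggest a temporary store that holds each actual scene recording together with its recognition result until an annotator labels it. The sketch below is a hedged illustration only: `SceneEntry`, `TemporaryCorpus`, and their methods are hypothetical names, and whatever follows labeling is cut off in this excerpt, so it is deliberately not modelled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class SceneEntry:
    """An actual scene recording plus its recognition result (S'01)."""
    audio: bytes
    recognized_text: str
    label: Optional[str] = None  # set by an annotator during S'02


class TemporaryCorpus:
    """Hypothetical in-memory stand-in for the server's temporary corpus."""

    def __init__(self) -> None:
        self._entries: Dict[int, SceneEntry] = {}
        self._next_id = 0

    def add(self, audio: bytes, recognized_text: str) -> int:
        """Store a recording and its recognition result; return its id."""
        entry_id = self._next_id
        self._entries[entry_id] = SceneEntry(audio, recognized_text)
        self._next_id += 1
        return entry_id

    def unlabeled(self) -> List[int]:
        """Ids of entries still waiting for annotation, in insertion order."""
        return [i for i, e in self._entries.items() if e.label is None]

    def annotate(self, entry_id: int, label: str) -> None:
        """Attach the annotator's label without altering the recognizer output."""
        self._entries[entry_id].label = label

    def get(self, entry_id: int) -> SceneEntry:
        return self._entries[entry_id]
```

Storing the recognizer output and the human label side by side keeps both available for later comparison; that design choice is an assumption of this sketch rather than something the excerpt states.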

Abstract

The invention discloses a method and a system for constructing a voice corpus. Speech data are collected through a voice input client and then transmitted to a server over a network. Collection can therefore be carried out anytime and anywhere using an existing network, without a dedicated recording studio or special recording equipment, which greatly reduces the collection cost. The speech data are also used for subsequent voice recognition. Voices to be recognized are all produced in everyday environments, so the recordings naturally contain ambient noise; speech data produced only in a recording studio are detached from real life and are poorly suited to recognizing voices in real-life scenes. The method and system bring the speech data closer to voices in real-life scenes while reducing cost, and improve the recognition rate of voices in real scenes.

Description

Technical field

[0001] The invention relates to a voice recognition method and system, in particular to a voice corpus construction method and system.

Background

[0002] Speech recognition technology has been under development for more than 40 years. It has made remarkable progress and has been adopted in some enterprise systems. However, limited recognition accuracy greatly restricts its use in a wider range of applications.

[0003] Speech recognition is an application of artificial intelligence and machine learning. Machine learning tasks are generally divided into two processes, training and prediction: the training process summarizes known samples to form a model, and the prediction process uses the model to analyze unknown samples. The predicted results therefore depend on the completeness and accuracy of the model. Machine learning tasks conform to the Baye...
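The training and prediction split described in the background can be shown with a toy example. This is a generic nearest-centroid classifier chosen purely for illustration; it is not the recognizer the patent relies on, and the function names `train` and `predict` and the feature vectors are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def train(samples: List[Tuple[Vector, str]]) -> Dict[str, List[float]]:
    """Training: summarize known samples into a model (one mean vector per label)."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, x in enumerate(features):
            sums[label][i] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(model: Dict[str, List[float]], features: Vector) -> str:
    """Prediction: assign an unknown sample to the label with the closest mean."""
    def squared_distance(centroid: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(model, key=lambda label: squared_distance(model[label]))
```

Because the model is nothing more than a summary of the training samples, its predictions are only as good as those samples, which is the dependence on model quality that the paragraph points out.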

Application Information

IPC(8): G10L15/06, G10L15/30

Inventors: 江南, 陈德全

Owner: CENTRIN DATA SYST
