Neural network training and voice endpoint detection method and device

A neural network training technique, applied in the field of voice endpoint detection, that addresses inaccurate cutting of voice segments, high delay, and false triggering, and yields more accurate detection results.

Active Publication Date: 2020-06-19

AISPEECH CO LTD

5 Cites, 7 Cited by

Problems solved by technology

[0004] In the process of implementing this application, the inventor found that the existing solutions have at least the following defects: 1. high delay, which affects the user experience; 2. missed detection, where voice is not detected and the voice segment is rejected; 3. false triggering, where a non-voice segment is detected as voice; 4. incorrect cutting of the voice segment, where its beginning or end is cut off.

Examples

Embodiment approach

[0084] In one implementation, the above-mentioned electronic device is applied to a neural network training device and includes:

[0085] at least one processor; and a memory communicatively connected to the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform:

[0086] randomly mixing speech audio data and non-speech audio data to form mixed audio data;

[0087] extracting acoustic features of the mixed audio data;

[0088] inputting the acoustic features into an FSMN model, and training the FSMN model so that the classification output by the FSMN model is substantially equal to the classification of the speech audio data and the non-speech audio data in the mixed audio data.
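The data-preparation steps in [0086] and [0087] can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: "randomly mixing" is interpreted here as concatenating speech and non-speech segments in shuffled order, log-energy stands in for the unspecified acoustic feature, and the function names, the 400-sample frame, and the 160-sample hop are all hypothetical choices.

```python
import numpy as np

def mix_segments(speech, noise, rng):
    """Concatenate speech and non-speech segments in random order.

    Returns the mixed stream and a per-sample label (1 = speech, 0 = non-speech).
    Shuffled concatenation is an assumed reading of "randomly mixing".
    """
    tagged = [(seg, 1) for seg in speech] + [(seg, 0) for seg in noise]
    order = rng.permutation(len(tagged))
    audio = np.concatenate([tagged[i][0] for i in order])
    labels = np.concatenate(
        [np.full(len(tagged[i][0]), tagged[i][1]) for i in order]
    )
    return audio, labels

def frame_features(audio, labels, frame_len=400, hop=160):
    """Compute one log-energy feature and one majority-vote label per frame."""
    n_frames = 1 + (len(audio) - frame_len) // hop
    feats = np.empty((n_frames, 1))
    frame_labels = np.empty(n_frames, dtype=int)
    for t in range(n_frames):
        start = t * hop
        frame = audio[start:start + frame_len]
        feats[t, 0] = np.log(np.sum(frame ** 2) + 1e-10)
        # A frame counts as speech if at least half of its samples are speech.
        frame_labels[t] = int(labels[start:start + frame_len].mean() >= 0.5)
    return feats, frame_labels

# Synthetic stand-ins: loud noise plays the role of speech, quiet noise of non-speech.
rng = np.random.default_rng(0)
speech = [rng.normal(0.0, 1.0, 1600) for _ in range(3)]
noise = [rng.normal(0.0, 0.01, 1600) for _ in range(3)]
audio, labels = mix_segments(speech, noise, rng)
feats, frame_labels = frame_features(audio, labels)
```

The pairs in `feats` and `frame_labels` are the kind of input and target that the training step in [0088] would consume; the per-frame labels are what the model's output classification is trained to match.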

[0089] In another implementation, the above-mentioned electronic device is applied to a voice endpoint detection device and includes:

...

Abstract

The invention discloses a neural network training method and a voice endpoint detection method and device. The training method comprises: randomly mixing voice audio data and non-voice audio data to form mixed audio data; extracting acoustic features of the mixed audio data; and inputting the acoustic features into an FSMN model and training the FSMN model so that the voice/non-voice classification output by the FSMN model is substantially equal to the classification of the voice audio data and the non-voice audio data in the mixed audio data. Because the non-voice and voice audio data are mixed and the mixed data are used as input to train a feedforward sequential memory network (FSMN), the network can output whether each audio data unit belongs to voice audio data or non-voice audio data. This information can then be used for voice endpoint detection, making the detection result more accurate.
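The abstract does not describe the internals of the FSMN model. As a rough illustration only, the sketch below shows the memory block commonly described for the vectorized FSMN, in which each frame's hidden activation is augmented with a weighted sum of neighbouring frames. The function name, tap counts, and tap values are hypothetical and are not taken from the patent.

```python
import numpy as np

def fsmn_memory_block(h, a, c):
    """Vectorized FSMN memory block (assumed form, not from the patent).

    m[t] = h[t] + sum_{i=1..N1} a[i-1] * h[t-i] + sum_{j=1..N2} c[j-1] * h[t+j]

    h: (T, D) hidden activations, one row per frame
    a: (N1, D) look-back taps, c: (N2, D) look-ahead taps
    Frames outside the sequence contribute zero.
    """
    T = h.shape[0]
    m = h.astype(float)  # fresh copy holding the identity term h[t]
    for i in range(1, min(a.shape[0], T - 1) + 1):
        m[i:] += a[i - 1] * h[:T - i]   # contribution from i frames in the past
    for j in range(1, min(c.shape[0], T - 1) + 1):
        m[:T - j] += c[j - 1] * h[j:]   # contribution from j frames in the future
    return m

# Tiny example: five frames, two hidden units, one tap in each direction.
h = np.ones((5, 2))
a = np.full((1, 2), 0.5)
c = np.full((1, 2), 0.25)
m = fsmn_memory_block(h, a, c)
```

In this form the output for frame t depends on frames up to t + N2, so the number of look-ahead taps directly adds latency, which is relevant to the delay problem the patent sets out to reduce.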

Description

Technical field

[0001] The invention belongs to the technical field of voice endpoint detection, and in particular relates to a neural network training method and a voice endpoint detection method and device.

Background

[0002] In the related art, Voice Activity Detection (VAD) is also called voice endpoint detection or voice boundary detection. It is used to detect whether a continuous audio stream contains a speech segment.

[0003] As shown in Figure 1, the start time (T1) and end time (T2) of the voice segment are calculated in real time. To preserve the quality of subsequent voice recognition or voice wake-up, the start time is moved earlier and the end time is moved later, so that two time points, T0 and T3, are finally output.

[0004] In the process of implementing this application, the inventor found that the existing solutions have at least the following defects: 1. High delay, which affects user experience; 2. No voice is detected, and the voi...
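The relationship between T1, T2, T0, and T3 described in [0003] can be illustrated with a short sketch. It assumes per-frame speech/non-speech decisions are already available; the function name, the 10 ms frame hop, and the 0.2 s and 0.3 s padding values are hypothetical, since the text does not state them.

```python
def padded_endpoints(frame_is_speech, hop_s=0.01, pad_start_s=0.2, pad_end_s=0.3):
    """Return (T0, T1, T2, T3) in seconds for the first speech run, or None.

    T1/T2 are the detected start/end of the speech run; T0 is T1 moved
    earlier and T3 is T2 moved later, as described in [0003].
    """
    try:
        first = frame_is_speech.index(True)
    except ValueError:
        return None  # no speech frame in the stream
    last = first
    while last + 1 < len(frame_is_speech) and frame_is_speech[last + 1]:
        last += 1
    t1 = first * hop_s
    t2 = (last + 1) * hop_s
    t0 = max(0.0, t1 - pad_start_s)  # never before the start of the stream
    t3 = t2 + pad_end_s              # may extend past the audio seen so far
    return t0, t1, t2, t3

# 0.5 s of non-speech, 1.0 s of speech, 0.5 s of non-speech at a 10 ms hop.
decisions = [False] * 50 + [True] * 100 + [False] * 50
endpoints = padded_endpoints(decisions)
```

Padding trades accuracy for safety: a larger margin protects the start and end of the utterance from being clipped, while a larger end margin also delays the moment the segment can be handed to recognition.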

Application Information

IPC(8): G10L15/04, G10L15/06, G10L15/08, G10L25/30

CPC: G10L15/04, G10L15/063, G10L15/08, G10L25/30

Inventor: 胡雪成

Owner: AISPEECH CO LTD
