Method and device for detecting voice endpoint

A voice endpoint detection technology, applied in voice analysis, voice recognition, instruments, etc., that addresses the high false alarm rate and the tendency of the signal-to-noise ratio to take extreme values.

Active Publication Date: 2014-04-16
BEIJING BAIDU NETCOM SCI & TECH CO LTD
6 Cites, 17 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, the above-mentioned speech endpoint detection method has a defect: its false alarm rate is high. Because the local energy minimum has a large dynamic range, the local minimum energy may approach zero in some regions, which drives the sub-band signal-to-noise ratio of the signal to be detected to extreme values. This makes the detection result insufficiently robust and produces false alarms.

Examples

Embodiment 1

[0062] Figure 1 is a flow chart of the method for voice endpoint detection provided by Embodiment 1 of the present invention. As shown in Figure 1, the method includes:

[0063] Step 101: Preprocessing the signal to be detected.

[0064] The preprocessing in this step includes framing the signal to be detected, pre-emphasis, windowing, and fast Fourier transform (FFT). These operations are prior art and are not described in detail here.
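The preprocessing chain named above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `preprocess`, the frame length, hop size, pre-emphasis coefficient, and the choice of a Hamming window are all assumptions, and pre-emphasis is applied to the whole signal before framing, as is common practice.

```python
import numpy as np

def preprocess(signal, frame_len=400, hop=160, pre_emph=0.97):
    """Pre-emphasize, frame, window, and FFT a 1-D signal.

    All default values are illustrative assumptions, not taken from the patent.
    The signal must contain at least frame_len samples.
    Returns the power spectrum with shape (num_frames, frame_len // 2 + 1).
    """
    signal = np.asarray(signal, dtype=np.float64)
    # Pre-emphasis: y[n] = x[n] - a * x[n-1], boosting high frequencies.
    emphasized = np.append(signal[0], signal[1:] - pre_emph * signal[:-1])
    # Framing: overlapping frames of frame_len samples, advanced by hop samples.
    num_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(num_frames)[:, None]
    frames = emphasized[idx]
    # Windowing: taper each frame (Hamming window assumed here).
    frames = frames * np.hamming(frame_len)
    # FFT of each frame; keep the squared magnitude of the one-sided spectrum.
    spectrum = np.fft.rfft(frames, axis=1)
    return np.abs(spectrum) ** 2
```

With these assumed defaults, one second of audio sampled at 16 kHz yields 25 ms frames with a 10 ms hop.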

[0065] Step 102: Sub-band decomposition is performed on the signal to be detected, and the sub-band energy of each frame is determined and recorded.

[0066] To calculate the sub-band SNR of each frame of the signal to be detected, the input signal must first be decomposed into sub-bands and the sub-band energy of each frame calculated. When performing sub-band decomposition, the frequency spectrum of each frame is usually divided into uniform and non-overlapping M...
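A uniform, non-overlapping sub-band split of each frame's spectrum might look like the sketch below. The function name `subband_energies`, the default number of sub-bands, and the decision to drop trailing bins that do not fill a whole band are assumptions made for illustration; the source text is truncated before it gives these details.

```python
import numpy as np

def subband_energies(power_spectrum, num_subbands=8):
    """Sum the power spectrum of each frame over uniform, non-overlapping bands.

    power_spectrum: array of shape (num_frames, num_bins).
    num_subbands: assumed value; the source does not state it in this excerpt.
    Returns an array of shape (num_frames, num_subbands).
    """
    power_spectrum = np.asarray(power_spectrum, dtype=np.float64)
    num_frames, num_bins = power_spectrum.shape
    bins_per_band = num_bins // num_subbands
    usable = bins_per_band * num_subbands
    # Drop trailing bins that do not fill a whole band (an assumption).
    banded = power_spectrum[:, :usable].reshape(
        num_frames, num_subbands, bins_per_band
    )
    # Energy of a sub-band is the sum of the power in its bins.
    return banded.sum(axis=2)
```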

Embodiment 2

[0103] Figure 2 is a structural diagram of the device for detecting voice endpoints provided by Embodiment 2 of the present invention. As shown in Figure 2, the device may include: a preprocessing unit 200, an energy determination unit 210, an energy tracking unit 220, a noise masking unit 230, a signal-to-noise ratio determination unit 240 and a voice decision unit 250.

[0104] The preprocessing unit 200 preprocesses the signal to be detected and provides the result to the energy determination unit 210. The preprocessing includes framing, pre-emphasis, windowing and fast Fourier transform. This unit is prior art and is not changed by the present invention.

[0105] The energy determination unit 210 determines and records the energy of each frame of the signal to be detected.

[0106] The energy tracking unit 220 performs minimum energy tracking based on the energy of each frame, that is, tracks the minimum energy for subsequent SNR calculation. ...
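One simple way to track a minimum energy is a running minimum over a window of recent frames, sketched below. The excerpt does not specify the tracking rule, so the sliding-window approach, the function name `track_minimum`, and the window length are all assumptions.

```python
import numpy as np

def track_minimum(energies, window=50):
    """Running minimum of frame energies over the most recent `window` frames.

    energies: shape (num_frames,) or (num_frames, num_subbands).
    window: assumed history length in frames; not taken from the patent.
    Returns an array of the same shape holding the tracked minimum per frame.
    """
    energies = np.asarray(energies, dtype=np.float64)
    tracked = np.empty_like(energies)
    for t in range(len(energies)):
        # Only the current frame and a bounded history are considered.
        start = max(0, t - window + 1)
        tracked[t] = energies[start:t + 1].min(axis=0)
    return tracked
```

For example, `track_minimum([5, 3, 4, 1, 6, 7, 8], window=3)` gives `[5, 3, 3, 1, 1, 1, 6]`: the minimum of 1 is forgotten once it leaves the three-frame history.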

Abstract

The invention provides a method and device for detecting a voice endpoint. The method comprises: determining and recording the energy of each frame of a signal to be detected; tracking the minimum energy based on the frame energies; applying noise masking, using a noise masking energy, to the signal to be detected and to the tracked minimum energy; determining the signal-to-noise ratio of each frame from the noise-masked signal and the noise-masked tracked minimum energy; and judging voice according to the signal-to-noise ratio of each frame and a preset threshold, thereby determining the voice endpoint. Noise masking avoids the singular values that arise when the local minimum energy tends to zero, which reduces the false alarm rate. It also narrows the dynamic range of the sub-band signal-to-noise ratio and improves the robustness of the detection result.
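The effect of noise masking on the SNR can be illustrated with a short sketch. The abstract does not give the masking formula, so this example assumes the masking energy is simply added to both the frame energy and the tracked minimum energy; the function name `detect_endpoints`, the masking energy value, and the threshold are likewise hypothetical. For brevity it works on one energy value per frame rather than per sub-band.

```python
import numpy as np

def detect_endpoints(energies, min_energies, mask_energy=1e-3, threshold_db=6.0):
    """Threshold a noise-masked SNR and return speech segments.

    Assumption: masking adds mask_energy to both numerator and denominator,
    so the ratio stays finite even when the tracked minimum is zero.
    Returns (snr_db, segments), where segments is a list of
    (start_frame, end_frame) pairs, both inclusive.
    """
    energies = np.asarray(energies, dtype=np.float64)
    min_energies = np.asarray(min_energies, dtype=np.float64)
    # Noise masking (assumed additive).
    masked_signal = energies + mask_energy
    masked_noise = min_energies + mask_energy
    snr_db = 10.0 * np.log10(masked_signal / masked_noise)
    # Voice decision: compare each frame's SNR with the preset threshold.
    is_speech = snr_db > threshold_db
    # Endpoints are the frames where the decision switches on or off.
    padded = np.concatenate(([False], is_speech, [False])).astype(int)
    changes = np.diff(padded)
    starts = np.flatnonzero(changes == 1)
    ends = np.flatnonzero(changes == -1) - 1
    return snr_db, list(zip(starts.tolist(), ends.tolist()))
```

Without the masking term, a tracked minimum of zero would put zero in the denominator and the SNR would be infinite or undefined. With it, the SNR is bounded above by the ratio of the frame energy to the masking energy, which is one way to read the abstract's claim of a narrower dynamic range.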

Description

【Technical field】

[0001] The invention relates to the technical field of speech in computer applications, in particular to a method and device for detecting speech endpoints.

【Background】

[0002] In a voice system, the voice signal is often input together with background noise. Accurately determining the start and end positions of the voice signal within the input signal is key to suppressing and removing the noise. Voice endpoint detection is the technology that does this: only when the endpoints of the voice signal are accurately determined can voice processing be performed correctly.

[0003] At present, voice endpoint detection based on minimum energy tracking is used: part of the historical information of the voice signal is retained, local minimum sub-band energy tracking is used to find the local minimum of the sub-band energy, and this local minimum energy value is used as the background noise. Referen...

Application Information

IPC(8): G10L15/04
Inventor: 宋辉, 关勇, 贾磊
Owner BEIJING BAIDU NETCOM SCI & TECH CO LTD