Speech endpoint detection method based on energy and harmonics

An endpoint detection technology based on energy detection, applied in speech analysis, speech recognition, instruments, etc. It addresses the sensitivity of existing methods to sudden changes in noise energy, their poor performance at low signal-to-noise ratios, and the difficulty of tracking noise energy.

Inactive Publication Date: 2007-02-14
INST OF ACOUSTICS CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

For example, in 2002 Sahar E. Bou-Ghazale et al. proposed a method that uses the energy superposition relationship and cepstral features for endpoint detection. This method works stably in a stationary noise environment, but it is very sensitive to sudden changes in noise energy, and because the cepstral features of speech are affected by noise, it cannot be used at low signal-to-noise ratios.
Another example is the endpoint detection method proposed by Arnaud Martin in 2003 that

Examples

Embodiment

[0035] The speech endpoint detection method based on energy and harmonics provided by the present invention includes three basic steps: energy detection, harmonic detection, and speech endpoint detection. The input is a digitized sound signal, which is divided into frames of equal length (each frame contains F sampling points and lasts about 25 milliseconds) that overlap one another by about 15 milliseconds. An L-frame buffer is used (L > 25, so that the L frames together last more than 200 milliseconds). The working process of each step of the present invention is described in detail below.
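The framing described in [0035] can be sketched as follows. This is a minimal illustration, not the patent's implementation: the 8000 Hz sample rate, the function names `split_into_frames` and `frame_energy`, and the use of mean-square energy are all assumptions made for the example. With 25 ms frames overlapping by 15 ms, consecutive frames start 10 ms apart.

```python
def split_into_frames(samples, sample_rate=8000, frame_ms=25, overlap_ms=15):
    """Split a sequence of samples into equal-length, overlapping frames.

    sample_rate is an assumed value; the text only gives the frame
    duration (about 25 ms) and the overlap (about 15 ms).
    """
    frame_len = sample_rate * frame_ms // 1000            # F samples per frame
    hop = sample_rate * (frame_ms - overlap_ms) // 1000   # shift between frames
    frames = []
    # Only full frames are kept; a trailing partial frame is dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


def frame_energy(frame):
    """Mean-square energy of one frame (an assumed energy measure)."""
    return sum(x * x for x in frame) / len(frame)


if __name__ == "__main__":
    one_second = [0.0] * 8000
    frames = split_into_frames(one_second)
    print(len(frames), len(frames[0]))
```

At the assumed 8000 Hz rate each frame holds F = 200 samples and successive frames share 120 samples, which corresponds to the 15 ms overlap.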

[0036] As shown in Figure 1, the present invention comprises the following steps:

[0037] Step 100: Set up the L-frame buffer, store the first L frames of the input signal in the buffer, and start endpoint detection. Each time a new frame is input, the buffer is automatically shifted and updated.

[0038] Step 200: Preliminarily detect the starting ...
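The shifting buffer of Step 100 can be sketched with a fixed-length queue. This is only an illustration under stated assumptions: the class name `FrameBuffer`, its methods, and the default of 30 frames are hypothetical, and the sketch covers buffering only, not the detection logic of the later steps.

```python
from collections import deque


class FrameBuffer:
    """Holds the most recent L frames; older frames are shifted out."""

    def __init__(self, num_frames=30):
        # The text requires L > 25; 30 is an arbitrary assumed default.
        if num_frames <= 25:
            raise ValueError("L must be greater than 25")
        self.frames = deque(maxlen=num_frames)

    def push(self, frame):
        """Append a new frame; once full, the oldest frame is dropped.

        Returns True when the buffer holds L frames, i.e. when
        endpoint detection could start.
        """
        self.frames.append(frame)
        return self.is_full()

    def is_full(self):
        return len(self.frames) == self.frames.maxlen


if __name__ == "__main__":
    buf = FrameBuffer(num_frames=30)
    for i in range(35):
        ready = buf.push([i])
    print(ready, buf.frames[0], buf.frames[-1])
```

A `deque` with `maxlen` performs the shift automatically on each append, which matches the "shifted and updated" behaviour described for every incoming frame.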

Abstract

This invention relates to a speech endpoint detection method based on energy and harmonics, comprising the following steps: preprocessing the digitized sound signal and storing it in a buffer; shifting and updating the buffer and adjusting the threshold each time a signal is input, in order to judge the speech start point based on energy; and then searching the buffer for signals with the harmonic characteristics of voiced sound. If voiced sound is found, the accurate speech start point is searched for; otherwise, the speech start point is searched for based on energy. The speech end point is then searched for according to the signal energy. Advantages: the method adjusts its accuracy according to the noise strength, so it adapts to the signal-to-noise ratio of the input signal, and its energy detection range is fairly wide, so speech with weak energy is not missed.

Description

Technical field

[0001] The invention relates to the field of automatic speech recognition, in particular to a speech endpoint detection method.

Background technique

[0002] The input signal of an automatic speech recognition system is usually speech with noise. To prevent signal segments without speech from entering the recognizer, and so to ensure system performance and reduce computing overhead, it is necessary to detect the start and end of the user's speech in the signal. This process is called endpoint detection.

[0003] Commonly used endpoint detection algorithms can be divided into rule-based and model-based methods. Rule-based methods generally use features such as signal energy, zero-crossing rate, cepstrum, and long-term spectrum estimation to calculate a distance, and determine whether speech is present by comparing the distance with a threshold and applying logical operations. Model-based methods generally build models for the statistical characteristics of noise an...

Application Information

IPC(8): G10L15/00, G10L15/04, G10L11/00, G10L21/0216, G10L25/87, G10L25/93
Inventor: 国雁萌, 付强
Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI