Voice awakening optimization method based on cascade DNN

A voice wake-up optimization method, applied in fields such as voice analysis and instruments, that addresses problems such as poor noise robustness and overly complex models, and achieves a low false wake-up rate, strong environmental adaptability, and good robustness.

Inactive Publication Date: 2019-06-14
武汉水象电子科技有限公司

AI Technical Summary

Problems solved by technology

[0013] The technical problem to be solved by the present invention is to overcome the shortcomings of prior-art voice wake-up methods, namely relatively complicated models and poor noise robustness, and to provide a voice wake-up optimization method based on cascaded DNNs.

Examples

Embodiment

[0042] As shown in Figures 1-2, a voice wake-up optimization method based on cascaded DNNs comprises the following steps:

[0043] 1) Acquire the voice signal collected by the microphone in real time, and obtain the frame-by-frame acoustic features of the real-time voice signal through feature extraction. Here, feature extraction refers to MFCC (Mel Frequency Cepstral Coefficients) extraction on the real-time voice signal, 14 dimensions in total, where the 14th dimension is the logarithmic energy of the current frame;

[0044] 2) Using a fixed window length, cut a segment from the acoustic feature sequence to form one frame, which serves as the input to the first-stage DNN;

[0045] 3) Run the forward pass of the first-stage DNN acoustic model; its output is the frame-by-frame acoustic posterior probability of the phonemes. The specific procedure is as follows:

[0046] a) Reshape the frame into one dimension to form a 1-dimensional feature sequence;

[0047] b) ...
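
The visible steps 2) to 3a) can be illustrated with a minimal NumPy sketch. It is not the patented implementation: the window length, the number of phoneme classes, the hidden-layer width, the single ReLU hidden layer, and the random weights are all assumptions made for illustration, and the helper names `sliding_windows` and `first_stage_forward` are hypothetical. Only the 14-dimensional feature size comes from the text.

```python
import numpy as np

FEAT_DIM = 14       # from the text: 14-dim MFCC features, 14th is log energy
WINDOW = 11         # assumed window length in frames (not given in the source)
NUM_PHONEMES = 40   # assumed number of phoneme classes (not given in the source)
HIDDEN = 64         # assumed hidden-layer width


def sliding_windows(features, window):
    """Cut fixed-length windows from a (T, feat_dim) feature sequence.

    Returns an array of shape (T - window + 1, window, feat_dim).
    """
    num = features.shape[0] - window + 1
    if num <= 0:
        return np.empty((0, window, features.shape[1]))
    return np.stack([features[i:i + window] for i in range(num)])


def softmax(x):
    """Numerically stable softmax over the last axis."""
    shifted = x - x.max(axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)


def first_stage_forward(windows, w1, b1, w2, b2):
    """Flatten each window to 1-D (step 3a), then run an assumed
    one-hidden-layer network that ends in a softmax over phonemes."""
    flat = windows.reshape(windows.shape[0], -1)
    hidden = np.maximum(flat @ w1 + b1, 0.0)  # ReLU, an assumption
    return softmax(hidden @ w2 + b2)


rng = np.random.default_rng(0)
# Random stand-in for real MFCC features: 100 frames of 14 dimensions.
features = rng.standard_normal((100, FEAT_DIM))
# Random stand-in weights; a real system would load trained parameters.
w1 = rng.standard_normal((WINDOW * FEAT_DIM, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
w2 = rng.standard_normal((HIDDEN, NUM_PHONEMES)) * 0.1
b2 = np.zeros(NUM_PHONEMES)

posteriors = first_stage_forward(sliding_windows(features, WINDOW), w1, b1, w2, b2)
print(posteriors.shape)  # (90, 40): one phoneme distribution per window position
```

Each row of `posteriors` sums to 1, so it can be read as a phoneme posterior distribution for the window that ends at that position.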

Abstract

The invention discloses a voice wake-up optimization method based on cascaded DNNs. The method comprises the following steps: 1) a voice signal is acquired by a microphone in real time, and feature extraction is performed to obtain frame-by-frame acoustic features of the real-time voice signal; 2) the acoustic feature sequence is cut with a fixed window length to form a frame that serves as the first-stage DNN input; 3) the forward pass of the first-stage DNN acoustic model is computed, and its output gives the frame-by-frame phoneme posterior probabilities; 4) the first-stage DNN output is cut with a fixed window length to form a one-frame phoneme posterior probability sequence that serves as the second-stage DNN input; 5) the forward pass of the second-stage DNN is computed to make the decision, and the output indicates whether to wake up. The method makes maximal use of the noise robustness of DNNs and adapts well to different environments; it does not require running a VAD before wake-up detection; the speech background does not need to be modeled separately; the two stages of models complement each other, so the corpus required for training can be greatly reduced; and no language model is needed, so no text corpus is required.
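
Steps 4 and 5 of the abstract can be sketched in the same spirit. The source does not describe the second-stage network, so everything below is an assumption for illustration: the posterior window length, the hidden width, the sigmoid output, the 0.5 threshold, and the hypothetical helper `second_stage_decide`. Random Dirichlet samples stand in for real first-stage output.

```python
import numpy as np

NUM_PHONEMES = 40   # assumed number of phoneme classes
POST_WINDOW = 30    # assumed window length over first-stage outputs
HIDDEN = 32         # assumed hidden-layer width
THRESHOLD = 0.5     # assumed decision threshold


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def second_stage_decide(posteriors, w1, b1, w2, b2,
                        window=POST_WINDOW, threshold=THRESHOLD):
    """Window the phoneme posteriors (step 4), then score each window
    with an assumed one-hidden-layer network (step 5).

    Returns one boolean wake decision per window position.
    """
    num = posteriors.shape[0] - window + 1
    if num <= 0:
        return np.zeros(0, dtype=bool)
    segments = np.stack([posteriors[i:i + window] for i in range(num)])
    flat = segments.reshape(num, -1)
    hidden = np.maximum(flat @ w1 + b1, 0.0)   # ReLU, an assumption
    score = sigmoid(hidden @ w2 + b2).ravel()  # wake-word probability
    return score >= threshold


rng = np.random.default_rng(0)
# Stand-in for first-stage output: 90 frames of phoneme posteriors.
posteriors = rng.dirichlet(np.ones(NUM_PHONEMES), size=90)
# Random stand-in weights; a real system would load trained parameters.
w1 = rng.standard_normal((POST_WINDOW * NUM_PHONEMES, HIDDEN)) * 0.1
b1 = np.zeros(HIDDEN)
w2 = rng.standard_normal((HIDDEN, 1)) * 0.1
b2 = np.zeros(1)

decisions = second_stage_decide(posteriors, w1, b1, w2, b2)
print(decisions.shape)  # (61,): one decision per posterior window
```

Because the second stage sees only phoneme posteriors rather than raw audio features, this arrangement needs no separate VAD stage or background model in the sketch, which matches the benefit the abstract claims for the cascade.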

Description

Technical field

[0001] The invention relates to a voice wake-up optimization method based on cascaded DNNs.

Background

[0002] Speech, as the most common and effective means of communication between people, has always been an important part of research on human-computer communication and human-computer interaction. Human-computer voice interaction technology, which combines speech synthesis, speech recognition, and natural language understanding, is recognized worldwide as a difficult and challenging technical field.

[0003] Automatic speech recognition is a key link in intelligent human-computer interaction. The problem it aims to solve is enabling a computer to "understand" human speech and extract the text information contained in the speech signal. The technology is equivalent to giving computers human-like "ears", and it plays a vital role in intelligent computer systems that can "hear and speak". Speech recognition is a multid...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L17/04, G10L17/24, G10L17/18
Inventor: 赵升
Owner: 武汉水象电子科技有限公司