A method for screening and optimizing audio keyword templates

A keyword template optimization method, applied in speech analysis, speech recognition, instruments, etc., that addresses problems such as insufficiently discriminative audio clips, reduced retrieval performance, and channel mismatch, achieving good retrieval performance and better adaptability to varied input.

Active Publication Date: 2020-04-03
INST OF ACOUSTICS CHINESE ACAD OF SCI +1

AI Technical Summary

Problems solved by technology

[0003] However, in practical tasks, different audio clips of a keyword often differ widely in quality, owing to factors such as noise, channel mismatch, and labeling errors.
Such audio clips may not be sufficiently discriminative and, if introduced directly into the keyword retrieval process, may degrade the system's retrieval performance.

Examples

Embodiment Construction

[0029] The method of the invention is applied at the front end of an audio-template-based spoken keyword retrieval system. First, each speech sample template of the keyword retrieval system is converted into a sequence of probability distributions by the acoustic model front end. Then the stability of the probability distributions within each sequence and the similarity between sequences are calculated, and on this basis the quality of each template is evaluated. Next, according to the quality evaluation criterion, the several most representative templates are selected, and their probability distributions are adjusted to obtain new templates of higher quality than the originals. These templates are used as the keyword templates for the subsequent retrieval process.
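
The two measurements named in this paragraph can be illustrated with a minimal sketch. The text available here does not give the patent's formulas, so both measures below are assumptions chosen for illustration: stability is taken as the mean cosine similarity between consecutive frames, and sequence similarity as a dynamic time warping (DTW) distance with a negative-log inner-product frame cost, a choice common in template-based keyword search. The function names `frame_stability` and `dtw_distance` are hypothetical.

```python
import math

def frame_stability(posteriorgram):
    """Mean cosine similarity between consecutive frames.

    `posteriorgram` is a list of frames, each a list of phoneme posteriors.
    This is an assumed stand-in for the patent's stability score.
    """
    def cosine(p, q):
        dot = sum(a * b for a, b in zip(p, q))
        norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
        return dot / norm if norm > 0 else 0.0

    pairs = list(zip(posteriorgram, posteriorgram[1:]))
    if not pairs:
        return 1.0  # a single frame is trivially stable
    return sum(cosine(p, q) for p, q in pairs) / len(pairs)

def dtw_distance(seq_a, seq_b, eps=1e-10):
    """DTW distance between two posteriorgrams (assumed similarity measure).

    The frame cost is -log of the inner product of the two posterior vectors;
    the accumulated cost is normalized by the summed sequence lengths.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dot = sum(a * b for a, b in zip(seq_a[i - 1], seq_b[j - 1]))
            cost = -math.log(max(dot, eps))
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m] / (n + m)
```

Under these assumptions, a template whose frames change erratically gets a low `frame_stability`, and a template far from the others of the same keyword gets a large `dtw_distance` to them; either signal could feed a quality score.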

[0030] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0031] Such as figure 1 Sho...

Abstract

The invention provides a method for screening and optimizing audio keyword templates. The method comprises: step 1, extracting features from each audio keyword template sample, passing the extracted features through a deep neural network, and calculating the posterior probabilities of all phonemes in a given phoneme set; step 2, calculating the posterior probability stability score, pronunciation reliability score, and neighborhood similarity score of each template; step 3, calculating the weighted average of these three scores for each audio keyword template and denoting it as the average score; step 4, sorting the templates by average score in descending order and selecting the first L audio keyword templates as representative pronunciation templates; step 5, processing each representative pronunciation template by adjusting the posterior probability of each pronunciation unit on each frame of the pronunciation sequence so as to minimize the template's neighborhood similarity score, thereby generating L optimized audio keyword templates.
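
Steps 3 and 4 of the abstract (weighted averaging and top-L selection) can be sketched as follows. This is an illustration only: the abstract does not state the weights or how the three scores are computed, so the function name `select_representative_templates`, the score triples, and the weights are all hypothetical inputs.

```python
def select_representative_templates(scores, weights, top_l):
    """Rank templates by the weighted average of their three scores.

    scores:  dict mapping a template id to a triple
             (stability, pronunciation reliability, neighborhood similarity).
    weights: three non-negative weights (assumed; not given in the abstract).
    top_l:   number of representative templates to keep (L in the abstract).
    Returns the ids of the top-L templates and the per-template average score.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    averaged = {
        tid: sum(w * s for w, s in zip(weights, triple)) / total
        for tid, triple in scores.items()
    }
    # Sort from largest to smallest average score, as in step 4.
    ranked = sorted(averaged, key=averaged.get, reverse=True)
    return ranked[:top_l], averaged


# Hypothetical scores for three templates of one keyword.
example_scores = {
    "t1": (0.9, 0.8, 0.7),
    "t2": (0.2, 0.3, 0.4),
    "t3": (0.6, 0.6, 0.6),
}
selected, averages = select_representative_templates(example_scores, (1, 1, 1), top_l=2)
```

With equal weights, `selected` keeps the two templates with the highest average scores. Step 5, the adjustment of per-frame posteriors, is not sketched because the abstract does not describe the update rule.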

Description

Technical field

[0001] The invention belongs to the field of speech recognition, and in particular relates to a method for screening and optimizing audio keyword templates.

Background

[0002] The keyword retrieval task is to quickly find the locations of a given keyword in large-scale, diverse speech data. In keyword retrieval based on speech clips, the keywords to be retrieved are given as a set of audio clip templates. These clips usually come from different speakers or are extracted from different contexts, and therefore differ in the information they contain. To obtain retrieval results that generalize better, that is, to handle keywords from different speakers or in different contexts in the speech to be retrieved, it is necessary to make full use of as many audio clips of a given keyword as possible. The traditional approach is to average all the templates belonging to a single keyword to obtain a single...

Application Information

Patent Type & Authority: Patents (China)
IPC(8): G10L15/02, G10L15/26
CPC: G10L15/02, G10L15/26
Inventor: 徐及, 张舸, 潘接林, 颜永红
Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI