A method for screening and optimizing audio keyword templates

An optimization method for keyword templates, applied in speech analysis, speech recognition, and related fields. It addresses audio clips that are not sufficiently discriminative, reduced system retrieval performance, and channel mismatch, with the stated aims of good retrieval performance and better adaptability to varied input.

Active Publication Date: 2020-04-03

INST OF ACOUSTICS CHINESE ACAD OF SCI +1

Problems solved by technology

[0003] However, in practical tasks, the audio clips available for a keyword often differ widely in quality, owing to factors such as noise, channel mismatch, and labeling errors. Such clips may not be sufficiently discriminative, so introducing them directly into the keyword retrieval process can degrade the retrieval performance of the system.

Embodiment Construction

[0029] The method of the invention is applied at the front end of an audio-template-based spoken keyword retrieval system. First, the acoustic-model front end converts each keyword speech sample template into a sequence of probability distributions. The stability of the probability distributions within each sequence and the similarity between sequences are then calculated, and the quality of each template is evaluated on this basis. According to this quality criterion, the most representative templates are selected and their probability distributions are adjusted, yielding new templates of higher quality than the originals. These templates are used as the keyword templates in the subsequent retrieval process.
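The two quantities mentioned in this paragraph, stability within a sequence and similarity between sequences, can be illustrated with a small sketch. The patent text shown here does not give their formulas, so both measures below are assumptions chosen only for illustration: mean per-frame entropy stands in for stability, and a dynamic time warping (DTW) cost with a negative log inner product frame distance stands in for similarity between sequences. The function names `mean_frame_entropy` and `dtw_distance` are hypothetical.

```python
import math

EPS = 1e-10  # floor that keeps log() away from zero

def mean_frame_entropy(posteriors):
    """Average per-frame entropy of a posterior sequence.

    `posteriors` is a list of frames, each a list of phoneme probabilities.
    ASSUMPTION: entropy is used here as a stand-in stability measure;
    lower values mean more peaked (more confident) frames.
    """
    total = 0.0
    for frame in posteriors:
        total += -sum(p * math.log(max(p, EPS)) for p in frame)
    return total / len(posteriors)

def dtw_distance(a, b):
    """Length-normalised DTW cost between two posterior sequences.

    ASSUMPTION: the frame distance is -log(dot product), a stand-in
    similarity measure, not the one defined in the patent.
    """
    ta, tb = len(a), len(b)
    inf = float("inf")
    # acc[i][j] holds the best accumulated cost aligning a[:i] with b[:j]
    acc = [[inf] * (tb + 1) for _ in range(ta + 1)]
    acc[0][0] = 0.0
    for i in range(1, ta + 1):
        for j in range(1, tb + 1):
            dot = sum(x * y for x, y in zip(a[i - 1], b[j - 1]))
            local = -math.log(max(dot, EPS))
            acc[i][j] = local + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[ta][tb] / (ta + tb)
```

With measures of this kind, a template whose frames are peaked and whose sequence lies close to the other templates of the same keyword would be rated as higher quality; the exact scoring used by the invention may differ.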

[0030] The present invention will be further described below in conjunction with the accompanying drawings and specific embodiments.

[0031] As shown in Figure 1...

Abstract

The invention provides a method for screening and optimizing audio keyword templates. The method comprises the following steps:
Step 1: extract features from each audio keyword template sample, pass the extracted features through a deep neural network, and calculate the posterior probabilities of all phonemes in a given phoneme set.
Step 2: calculate the posterior probability stability score, the pronunciation reliability score, and the neighborhood similarity score of each template.
Step 3: calculate the weighted average of these three scores for each audio keyword template and denote it as the average score.
Step 4: sort the templates by average score in descending order and select the first L audio keyword templates as representative pronunciation templates.
Step 5: process each representative pronunciation template by adjusting the posterior probability of each pronunciation unit on each frame of the pronunciation sequence and minimizing the neighborhood similarity score of the template, generating L optimized audio retrieval word templates.
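The scoring and selection steps of the abstract (the weighted average of the three scores and the choice of the first L templates) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the three per-template scores are taken as given inputs because their formulas are not shown here, and the function name `select_representative_templates`, the weights, and the example values are all hypothetical.

```python
def select_representative_templates(scores, weights, num_selected):
    """Rank templates by weighted average score and keep the top ones.

    scores: dict mapping template id -> (stability, reliability, neighborhood)
            score triple; how each score is computed is NOT specified here.
    weights: three non-negative weights (ASSUMED values, chosen by the user).
    num_selected: the number L of representative templates to keep.
    Returns (selected ids in descending score order, dict of average scores).
    """
    weight_sum = sum(weights)
    averaged = {
        template_id: sum(w * s for w, s in zip(weights, triple)) / weight_sum
        for template_id, triple in scores.items()
    }
    # Sort from largest to smallest average score, as in the selection step
    ranked = sorted(averaged, key=averaged.get, reverse=True)
    return ranked[:num_selected], averaged


# Hypothetical example: three templates of one keyword, equal weights, L = 2
example_scores = {
    "t1": (0.9, 0.8, 0.7),
    "t2": (0.2, 0.3, 0.1),
    "t3": (0.6, 0.9, 0.8),
}
selected, averages = select_representative_templates(example_scores, (1, 1, 1), 2)
```

Keeping the weights as an explicit argument makes it easy to emphasise one score over the others; the optimization of the selected templates in the final step is not covered by this sketch.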

Description

Technical field

[0001] The invention belongs to the field of speech recognition, and in particular relates to a method for screening and optimizing audio keyword templates.

Background

[0002] The keyword retrieval task is to quickly locate a given keyword in large-scale, diverse speech data. In keyword retrieval based on speech clips, the keywords to be retrieved are given as a set of audio clip templates. These clips usually come from different speakers or are extracted from different contexts, and thus differ in the information they contain. To obtain retrieval results that generalize better, that is, to handle keywords spoken by different speakers or appearing in different contexts in the speech to be retrieved, it is necessary to make full use of as many audio clips of a given keyword as possible. The traditional approach is to average all the templates belonging to a single keyword to obtain a single...

Application Information

Patent Type & Authority: Patents (China)

IPC(8): G10L15/02, G10L15/26

CPC: G10L15/02, G10L15/26

Inventor: 徐及, 张舸, 潘接林, 颜永红

Owner: INST OF ACOUSTICS CHINESE ACAD OF SCI
