
Method and device for multi-accent speech recognition

A speech recognition and accent technology, applied in the field of model training, that addresses the large size of expert models, redundant parameters, and the inability to adjust the model quickly, so as to improve speech enhancement performance and reduce the word error rate.

Active Publication Date: 2021-11-02
AISPEECH CO LTD
15 Cites, 2 Cited by

AI Technical Summary

Problems solved by technology

[0004] In the course of implementing this application, the inventors found the following defects in existing technical solutions: in a multi-expert system, each expert is large and has redundant parameters, and the model cannot be adjusted quickly according to how difficult the accents are to discriminate. In addition, each accent requires its own expert system to attend to the information relevant to that accent, which makes the model large.


Detailed Description of the Embodiments

[0017] To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

[0018] Refer to Figure 1, which shows a flow chart of an embodiment of the multi-accent speech recognition method of the present application. In the method of this embodiment, for a single speech recognition system, adaptive layers are used to learn feature information related to accents, inc...


Abstract

The invention discloses a method and a device for multi-accent speech recognition. For a single speech recognition system, the method adds an adaptive layer in the encoding stage to learn accent-related feature information. The encoder consists of a plurality of encoder blocks connected in series. For each encoder block, an accent representation vector serves as guidance information: it is input into the adaptive layer and guides the transformation function in that layer. Accent-independent features are input into the adaptive layer at the same time, and the adaptive layer mixes the accent-independent features with the accent representation vector to form accent-related features. The embodiments of the invention further examine the injection position of the adaptive layer, the number of accent bases, and different types of accent bases, so that better accent adaptation is achieved.
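The data flow described in the abstract can be illustrated with a minimal sketch in plain Python. Everything specific in it is an assumption made for illustration and is not taken from the patent text: the transformation is assumed to be a feature-wise scale and shift derived from the accent vector, the adaptive layer is assumed to sit before every encoder block, and `encoder_block` is a trivial placeholder for a real block. The names `adapter_layer`, `encode`, and `init_blocks` are hypothetical.

```python
import math
import random

def affine(W, b, z):
    """Return W @ z + b for a list-of-rows matrix W."""
    return [sum(w * v for w, v in zip(row, z)) + bi for row, bi in zip(W, b)]

def adapter_layer(h, z, p):
    """Mix accent-independent frames h (T x d) with accent vector z (length e).

    Assumed form: z is mapped to a per-dimension scale and shift that are
    applied to every frame (a feature-wise affine transformation).
    """
    scale = [1.0 + math.tanh(s) for s in affine(p["W_scale"], p["b_scale"], z)]
    shift = affine(p["W_shift"], p["b_shift"], z)
    return [[x * s + t for x, s, t in zip(frame, scale, shift)] for frame in h]

def encoder_block(h, W):
    """Placeholder for a real encoder block: a residual tanh projection."""
    zeros = [0.0] * len(W)
    return [[x + math.tanh(y) for x, y in zip(frame, affine(W, zeros, frame))]
            for frame in h]

def encode(x, z, blocks):
    """Run the serially connected blocks; the same z guides every adapter."""
    h = x
    for blk in blocks:
        h = adapter_layer(h, z, blk["adapter"])  # accent-related features
        h = encoder_block(h, blk["W"])
    return h

def init_blocks(num_blocks, d, e, seed=0):
    """Create small random parameters for num_blocks adapter/encoder pairs."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    return [{"adapter": {"W_scale": mat(d, e), "b_scale": [0.0] * d,
                         "W_shift": mat(d, e), "b_shift": [0.0] * d},
             "W": mat(d, d)} for _ in range(num_blocks)]

# Usage: 5 frames of 4-dimensional features, 2-dimensional accent vectors.
blocks = init_blocks(num_blocks=3, d=4, e=2)
x = [[0.1 * (t + k) for k in range(4)] for t in range(5)]
out_a = encode(x, [1.0, 0.0], blocks)  # hypothetical accent A
out_b = encode(x, [0.0, 1.0], blocks)  # hypothetical accent B
```

With this assumed form, an all-zero accent vector leaves the features unchanged, so the adapter only perturbs the encoder when accent information is present. The same input frames yield different encoder outputs for different accent vectors.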

Description

Technical field

[0001] The invention belongs to the technical field of model training, and in particular relates to a method and device for multi-accent speech recognition.

Background

[0002] In related technologies, the end-to-end (E2E) automatic speech recognition (ASR) model directly optimizes the probability of the output sequence given the input acoustic features, and has made great progress on various speech corpora. One of the most pressing needs of ASR today is to support multiple accents in a single system, which the literature often refers to as multi-accent speech recognition. The difficulties of recognizing accented speech, such as differences in phonetics and grammar, pose serious challenges to current ASR systems. A simple approach is to build a single ASR model from mixed data (accented data from non-native speakers and standard data from native speakers). However, such models often suffer from severe performance degradation due to...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L15/07, G10L15/16, G10L15/22, G10L15/26
CPC: G10L15/07, G10L15/16, G10L15/22, G10L15/26, Y02T10/40
Inventor: 钱彦旻, 龚勋, 卢怡宙, 周之恺
Owner: AISPEECH CO LTD