Model training method and device, voice recognition method and device, server and storage medium

A model training and speech recognition technology, applied in speech recognition, speech analysis, instruments, and related fields. It addresses problems such as the poor robustness of speech recognition models, large differences in accuracy between speech recognition models, and training data that cannot be shared, and it improves flexibility, accuracy, robustness, and scalability.

Active Publication Date: 2021-08-13

PING AN TECH (SHENZHEN) CO LTD

7 Cites, 0 Cited by


Problems solved by technology

The diversification of voice data channels, with their different bandwidths and encoding formats, has brought difficulties and challenges to speech recognition.

[0003] Current speech recognition models can only recognize speech data from a single channel. Application scenarios that involve speech data from different channels therefore require training multiple speech recognition models, one matched to the speech data of each channel. As a result, the speech recognition models have poor robustness, and because training data cannot be shared between them, either their accuracy differs widely or more training data is required. These are major drawbacks.


Embodiment Construction

[0036] The following will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, not all of them. Based on the embodiments in this application, all other embodiments obtained by persons of ordinary skill in the art without making creative efforts belong to the scope of protection of this application.

[0037] The flowcharts shown in the drawings are only illustrations; they do not necessarily include all contents and operations/steps, nor must the operations/steps be performed in the order described. For example, some operations/steps can be decomposed, combined, or partly combined, so the actual order of execution may change according to the actual situation. In addition, although the functional modules are divided in the schematic diagram of the device, in some cases, they ...


Abstract

The invention relates to model construction in artificial intelligence and provides a model training method and device, a voice recognition method and device, a server, and a storage medium. The model training method comprises: performing first signal processing on voice data to obtain first voice data, and performing second signal processing on the voice data to obtain second voice data; inputting the first voice data and the second voice data into a feature extraction model to extract a first feature vector of the first voice data and a second feature vector of the second voice data; calculating the mutual information between the first voice data and the second voice data from the first feature vector and the second feature vector; updating the model parameters of the feature extraction model according to that mutual information until the feature extraction model converges; and fusing and fine-tuning the converged feature extraction model and the trained speech recognition model to obtain a target speech recognition model. The invention can improve the robustness of the voice recognition model.
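The mutual-information step in the abstract can be illustrated with a small sketch. The abstract does not say which estimator is used, so the code below assumes the InfoNCE lower bound, a common choice for paired feature batches. The function names, the linear `tanh` feature extractor, the temperature value, and the two toy "signal processing" views are all hypothetical stand-ins, not the patent's actual components.

```python
import numpy as np

def extract_features(x, weights):
    """Hypothetical stand-in for the feature extraction model:
    a single linear projection followed by tanh."""
    return np.tanh(x @ weights)

def infonce_lower_bound(f1, f2, temperature=0.1):
    """InfoNCE lower bound on the mutual information between two feature batches.

    f1[i] and f2[i] are assumed to come from the same utterance (a positive
    pair); every other row of f2 serves as a negative for f1[i].
    The returned value never exceeds log(batch size).
    """
    # Normalize rows so the dot product is a cosine similarity.
    f1 = f1 / np.linalg.norm(f1, axis=1, keepdims=True)
    f2 = f2 / np.linalg.norm(f2, axis=1, keepdims=True)
    logits = f1 @ f2.T / temperature                      # shape (N, N)
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # log N plus the mean log-probability assigned to the true pair.
    return np.log(len(f1)) + np.mean(np.diag(log_softmax))

# Toy data: 64 "utterances" with 40-dimensional frames (assumed sizes).
rng = np.random.default_rng(0)
speech = rng.normal(size=(64, 40))
weights = rng.normal(size=(40, 16)) / np.sqrt(40)

# Two assumed processings of the same data: additive noise and a gain change.
first_view = speech + 0.01 * rng.normal(size=speech.shape)
second_view = 0.5 * speech

f_first = extract_features(first_view, weights)
f_second = extract_features(second_view, weights)

print("bound for matched pairs:   ", infonce_lower_bound(f_first, f_second))
print("bound for mismatched pairs:", infonce_lower_bound(f_first, np.roll(f_second, 1, axis=0)))
```

In a training loop, the negative of this bound would typically serve as the loss, so that gradient updates to the feature extractor increase the estimated mutual information between the two processed versions of each utterance. This is one plausible reading of the parameter-update step, not a statement of how the patent implements it.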

Description

Technical field

[0001] The present application relates to the technical field of model building, and in particular to a model training method, speech recognition method, device, server and storage medium.

Background

[0002] With the continuous development of the new media industry, the channels of voice data are gradually diversifying, with different bandwidths and encoding formats. For example, voice data may be recorded at a sampling rate of 8 kHz or 16 kHz, or encoded in formats such as μ-law, A-law, and AMR. In some cases, the voice data is also compressed or otherwise processed during transmission. These factors have brought difficulties and challenges to speech recognition.

[0003] The current speech recognition model can only recognize speech data of a single channel. For application scenarios with speech data of different channels, it is necessary to train multiple speech recognition models that match the speech data of each cha...
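To make the channel differences mentioned in the background concrete, the following sketch simulates two of them on a waveform: μ-law companding with 8-bit quantization, and a crude halving of the sample rate (for example 16 kHz to 8 kHz). It uses the continuous μ-law formula rather than the bit-exact G.711 codec, and the pair-averaging resampler is a deliberately simple assumption. The function names are hypothetical, and the source does not say that these are the signal processing operations the method applies.

```python
import numpy as np

MU = 255.0  # companding constant of 8-bit mu-law

def mu_law_compress(x, mu=MU):
    """Continuous mu-law companding of samples in [-1, 1]."""
    x = np.clip(x, -1.0, 1.0)
    return np.sign(x) * np.log1p(mu * np.abs(x)) / np.log1p(mu)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return np.sign(y) * np.expm1(np.abs(y) * np.log1p(mu)) / mu

def simulate_ulaw_channel(x, bits=8):
    """Compress, quantize uniformly to 2**bits levels, then expand again."""
    levels = 2 ** bits - 1
    y = mu_law_compress(x)
    quantized = np.round((y + 1.0) / 2.0 * levels) / levels * 2.0 - 1.0
    return mu_law_expand(quantized)

def halve_sample_rate(x):
    """Crude rate halving: average each adjacent pair of samples.
    A real resampler would use a proper anti-aliasing filter."""
    n = len(x) - len(x) % 2  # drop a trailing odd sample
    return x[:n].reshape(-1, 2).mean(axis=1)

# One second of a 440 Hz tone at an assumed 16 kHz sampling rate.
t = np.arange(16000) / 16000.0
wave = 0.8 * np.sin(2 * np.pi * 440.0 * t)

ulaw_version = simulate_ulaw_channel(wave)
narrowband_version = halve_sample_rate(wave)
print(len(wave), len(ulaw_version), len(narrowband_version))
```

Passing the same utterance through two such transformations yields two versions of one recording that differ only in channel characteristics, which is the kind of paired input the abstract describes feeding to the feature extraction model.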


Application Information


IPC(8): G10L15/06, G10L15/02, G10L15/16

CPC: G10L15/063, G10L15/02, G10L15/16

Inventors: 王璐, 魏韬, 马骏, 王少军

Owner: PING AN TECH (SHENZHEN) CO LTD
