Phoneme detection method and device based on multiple tasks

A detection method and device, applied in speech analysis, speech recognition, instruments, etc. It addresses three problems: phoneme alignment accuracy is low, phoneme recognition and phoneme alignment cannot be completed at the same time, and the phoneme recognition task and the phoneme alignment task cannot share learning results. The effect is improved accuracy and accurate data support.

Active Publication Date: 2021-02-26
SICHUAN CHANGHONG ELECTRIC CO LTD
Cites: 13; Cited by: 1

AI Technical Summary

Problems solved by technology

[0003] The purpose of the present invention is to provide a multi-task phoneme detection method and device that solves three problems: phoneme recognition and phoneme alignment cannot be completed at the same time, phoneme alignment accuracy is low, and the phoneme recognition task and the phoneme alignment task cannot share learning results.


Examples


Embodiment 1

[0029] A multi-task phoneme detection method, comprising the following steps:

[0030] Step A1) Train the phoneme detection model. As shown in Figure 3, training the phoneme detection model includes the following steps:

[0031] Step B1) Define the set of phonemes/syllables to be recognized and arrange the phonemes/syllables in the set in a fixed order. Phonemes are the smallest phonetic units divided according to the natural attributes of speech; they are analyzed according to the pronunciation actions within a syllable, where one action constitutes one phoneme.

[0032] Step B2) Obtain one or more speaker data sets, each including the speaker's voice information and a phoneme position annotation file that corresponds to the voice information and to the phoneme/syllable order.

[0033] Step B3) Segment the voice information according to the phoneme annotation file from step B2) to obtain the phoneme segmentation subsequences, which serve as the desired result.

[0034] Step B4) The origi...
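
Steps B1) to B3) can be illustrated with a minimal sketch. The source does not specify the annotation file format, so the `Annotation` record, the sample-index boundaries, and the three-phoneme inventory below are all assumptions made for illustration; they are not the patent's actual data layout.

```python
from dataclasses import dataclass

# Step B1 (assumed inventory): an ordered phoneme set; list index = class id.
PHONEMES = ["a", "b", "sh"]
PHONEME_TO_ID = {p: i for i, p in enumerate(PHONEMES)}


@dataclass
class Annotation:
    """One row of a hypothetical phoneme position annotation file (step B2)."""
    phoneme: str
    start: int  # first sample index, inclusive
    end: int    # last sample index, exclusive


def segment_by_annotation(samples, annotations):
    """Step B3: cut the waveform into labelled phoneme subsequences."""
    segments = []
    for ann in annotations:
        if not (0 <= ann.start < ann.end <= len(samples)):
            raise ValueError(f"annotation out of range: {ann}")
        label = PHONEME_TO_ID[ann.phoneme]
        segments.append((label, samples[ann.start:ann.end]))
    return segments


# Toy waveform and annotations, purely illustrative.
samples = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
annotations = [Annotation("sh", 0, 2), Annotation("a", 2, 6)]
segments = segment_by_annotation(samples, annotations)
```

Each element of `segments` pairs a class id from the ordered phoneme set with the slice of audio it covers, which is the kind of labelled subsequence a training step could consume.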


Abstract

The invention discloses a phoneme detection method based on multiple tasks. The method comprises the following steps: A1) training a phoneme detection model; A2) obtaining a voice sequence to be detected; A3) segmenting the voice sequence into a plurality of basic sub-sequences; A4) moving the endpoints of the basic sub-sequences to obtain a set of transformation sub-sequences; A5) inputting all the transformation sub-sequences into the phoneme detection model to obtain predicted phonemes and their corresponding confidence values; A6) taking the transformation sub-sequence with the highest confidence as the new basic sub-sequence; and A7) judging whether the basic sub-sequence meets a termination condition and, if so, outputting the phoneme detection result and the phoneme position, or, if not, returning to step A4). The method solves the technical problems that phoneme recognition and phoneme alignment cannot be completed at the same time, that phoneme alignment accuracy is low, and that the phoneme recognition task and the phoneme alignment task cannot share learning results.
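
The loop in steps A4) to A7) can be sketched for a single basic sub-sequence as follows. This is a rough illustration under stated assumptions, not the patented implementation: `model` is a hypothetical callable returning `(phoneme, confidence)`, the endpoint shift `step` and the `max_iters` cap are invented parameters, and the termination condition (stop when no shifted candidate improves the confidence) is an assumption, since the abstract does not define it.

```python
def refine_segment(samples, start, end, model, step=1, max_iters=50):
    """Greedy endpoint refinement for one basic sub-sequence (A4 to A7)."""
    best = (start, end)
    best_phoneme, best_conf = model(samples[start:end])
    for _ in range(max_iters):
        # A4: move each endpoint by -step, 0 or +step to build the candidate set.
        candidates = []
        for ds in (-step, 0, step):
            for de in (-step, 0, step):
                s, e = best[0] + ds, best[1] + de
                if 0 <= s < e <= len(samples):
                    candidates.append((s, e))
        # A5: score every candidate with the (hypothetical) detection model.
        scored = [(model(samples[s:e]), (s, e)) for s, e in candidates]
        # A6: keep the candidate with the highest confidence.
        (phoneme, conf), segment = max(scored, key=lambda item: item[0][1])
        # A7 (assumed condition): stop once no candidate improves confidence.
        if conf <= best_conf:
            break
        best, best_phoneme, best_conf = segment, phoneme, conf
    return best_phoneme, best, best_conf


def toy_model(window):
    """Stand-in model: confidence is the overlap (IoU) with a known true span."""
    true_span = set(range(5, 12))
    w = set(window)
    return "a", len(w & true_span) / len(w | true_span)


# Sample values equal their own indices so the toy model can see positions.
samples = list(range(20))
result = refine_segment(samples, 3, 9, toy_model)
```

With the toy model, the initial guess `(3, 9)` moves step by step toward the true span `(5, 12)`. A full pipeline would presumably run this refinement for every basic sub-sequence produced in step A3).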

Description

Technical field

[0001] The invention relates to the field of data intelligence, in particular to a multi-task-based phoneme detection method and device.

Background technique

[0002] With the development of deep learning technology, deep speech processing technologies such as speech recognition, voiceprint recognition, speech synthesis, and speech emotion analysis continue to achieve breakthroughs. As the smallest speech unit divided by the natural properties of speech, phonemes play a very important role in deep speech processing and are the basis of most speech processing. Phonemes are also of great significance to the rapid response of deep speech processing systems in actual scenarios. However, very few speech databases in the existing data sets contain phoneme alignment information, and those that do are limited by the phoneme definition specifications of the database itself. It is easy to encounter situations where the phoneme definition specifications...


Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/51, G10L25/24, G10L15/02, G10L15/06
CPC: G10L25/51, G10L25/24, G10L15/02, G10L15/063, G10L2015/025, G10L2015/0631
Inventor: 谢川
Owner: SICHUAN CHANGHONG ELECTRIC CO LTD