Speaker separation method and related equipment thereof

A speaker separation technology, applied in speech analysis, instruments, and similar fields, that addresses the problem of low speaker separation accuracy.

Pending Publication Date: 2021-09-07

IFLYTEK CO LTD

Cites: 0; cited by: 3


Problems solved by technology

[0004] However, because of defects in related speaker separation technology, it cannot perform speaker separation on some complex speech data (such as voice data in which multiple speakers speak at the same time), so its speaker separation accuracy is low.


Examples


Embodiment 1

[0027] See Figure 1, which is a flow chart of a speaker separation method provided in an embodiment of the present application.

[0028] The speaker separation method provided in this embodiment includes steps S1 to S3:

[0029] S1: Obtain the voice data to be separated.

[0030] The voice data to be separated is voice data that requires speaker separation processing, and it includes the voice information of at least one speaker. For example, it may include the voice information of N speakers, where N is a positive integer.

[0031] This embodiment does not limit the voice data to be separated. For example, it may include at least one piece of overlapping audio data, where "overlapping audio data" is audio data produced by multiple speakers speaking at the same time.

[0032] In addition, in order to ...

Example 1

[0086] In Example 1, step 2311 may specifically include steps 41 and 42:

[0087] Step 41: Permute the K pieces of predicted speech separation data corresponding to the g-th sample speech to obtain T permutation sequences of the K pieces of predicted speech separation data.

[0088] Step 42: According to the t-th permutation sequence of the K pieces of predicted speech separation data, establish a correspondence between the K pieces of predicted speech separation data and the K pieces of actual speech separation data of the g-th sample speech, the latter arranged in the first order. This yields the t-th candidate data correspondence for the g-th sample speech (as shown in formula (1)). The "first order" can be preset; t is a positive integer, t ≤ T, and T is a positive integer.

[0089]

[0090] In the formula, Represents the corresponding rela...
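Steps 41 and 42 can be illustrated with a minimal sketch. It assumes that "permuting" means enumerating every ordering of the K predicted items, so that T = K! (the text does not state the value of T). The function name `candidate_correspondences_by_prediction` and the string placeholders standing in for speech separation data are hypothetical.

```python
from itertools import permutations


def candidate_correspondences_by_prediction(predicted, actual):
    """Enumerate candidate prediction-to-target pairings (steps 41 and 42).

    Assumption: each ordering of the K predicted items counts as one
    permutation sequence, giving T = K! candidates. `actual` stays in
    its preset "first order" throughout.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must both contain K items")
    candidates = []
    # Step 41: every ordering of the K predicted items.
    for ordering in permutations(predicted):
        # Step 42: pair the t-th ordering with the fixed-order targets.
        candidates.append(list(zip(ordering, actual)))
    return candidates


# Hypothetical placeholders for K = 3 pieces of separation data.
predicted = ["pred_1", "pred_2", "pred_3"]
actual = ["ref_1", "ref_2", "ref_3"]
candidates = candidate_correspondences_by_prediction(predicted, actual)
print(len(candidates))  # number of candidate correspondences (T)
print(candidates[0])    # the first candidate correspondence
```

Each element of `candidates` is one candidate data correspondence; how a single candidate is then selected is outside what this excerpt shows.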

Example 2

[0092] In Example 2, step 2311 may specifically include steps 51 and 52:

[0093] Step 51: Permute the K pieces of actual speech separation data corresponding to the g-th sample speech to obtain T permutation sequences of the K pieces of actual speech separation data.

[0094] Step 52: According to the t-th permutation sequence of the K pieces of actual speech separation data, establish a correspondence between the K pieces of actual speech separation data and the K pieces of predicted speech separation data of the g-th sample speech, the latter arranged in the second order. This yields the t-th candidate data correspondence for the g-th sample speech (as shown in formula (2)). The "second order" can be preset; t is a positive integer, t ≤ T, and T is a positive integer.

[0095]

[0096] In the formula, Represents the corresponding relationship of the tth candidate data corresponding to the gth sampl...
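Steps 51 and 52 mirror the previous example with the roles swapped: the actual data is permuted and the predicted data stays in a fixed order. The sketch below again assumes T = K! (not stated in the text) and uses hypothetical names and placeholder strings. It also checks that permuting either side enumerates the same set of one-to-one pairings.

```python
from itertools import permutations


def candidate_correspondences_by_target(predicted, actual):
    """Enumerate candidate pairings by permuting the targets (steps 51 and 52).

    Assumption: every ordering of the K actual items is one permutation
    sequence (T = K!). `predicted` stays in its preset "second order".
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must both contain K items")
    candidates = []
    # Step 51: every ordering of the K actual items.
    for ordering in permutations(actual):
        # Step 52: pair the fixed-order predictions with the t-th ordering.
        candidates.append(list(zip(predicted, ordering)))
    return candidates


def as_pair_sets(candidates):
    """Ignore ordering so that two enumerations can be compared as sets."""
    return {frozenset(pairs) for pairs in candidates}


# Hypothetical placeholders for K = 3 pieces of separation data.
predicted = ["pred_1", "pred_2", "pred_3"]
actual = ["ref_1", "ref_2", "ref_3"]

by_target = candidate_correspondences_by_target(predicted, actual)
# For comparison: permute the predictions and keep the targets fixed.
by_prediction = [list(zip(p, actual)) for p in permutations(predicted)]
same_pairings = as_pair_sets(by_target) == as_pair_sets(by_prediction)
print(len(by_target), same_pairings)
```

Under this assumption the two enumerations cover the same pairings, and only the index t at which a given pairing appears may differ.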


Abstract

The invention discloses a speaker separation method and related equipment. After obtaining to-be-separated voice data that includes the voice information of at least one speaker, the method inputs that data into a pre-constructed voice separation model and obtains at least one piece of voice separation data output by the model, where each piece carries the voice information of a different speaker (that is, different pieces of voice separation data record the voice information of different speakers). The method then determines a speaker separation result for the to-be-separated voice data according to the at least one piece of voice separation data, so that the result can accurately represent the voice segment corresponding to each speaker. In this way, the adverse effect of failing to accurately recognize the multiple speakers in overlapping audio data can be avoided, and speaker separation accuracy can therefore be improved.
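The last stage of this flow, turning per-speaker separation data into per-speaker voice segments, can be sketched as follows. This is an illustration under stated assumptions, not the method claimed in the patent: the voice separation model is not implemented (its output is replaced by hand-made sample lists), and activity is decided by a simple frame-energy threshold, which the abstract does not specify. All function names and parameter values are hypothetical.

```python
def active_frames(stream, frame_len, threshold):
    """Flag a frame as active when its mean energy exceeds `threshold`.

    Assumption: an energy threshold stands in for whatever activity
    decision the real system uses.
    """
    flags = []
    for start in range(0, len(stream) - frame_len + 1, frame_len):
        frame = stream[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        flags.append(energy > threshold)
    return flags


def frames_to_segments(flags):
    """Merge consecutive active frames into (start, end) spans, end exclusive."""
    segments = []
    start = None
    for i, active in enumerate(flags):
        if active and start is None:
            start = i
        elif not active and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(flags)))
    return segments


def separation_result(streams, frame_len=4, threshold=0.01):
    """Map each separated stream to the frame spans in which it is active."""
    return {
        f"speaker_{k}": frames_to_segments(active_frames(s, frame_len, threshold))
        for k, s in enumerate(streams, start=1)
    }


# Hypothetical model output: two separated streams of 16 samples each.
stream_1 = [0.5] * 12 + [0.0] * 4  # active in frames 0, 1, 2
stream_2 = [0.0] * 8 + [0.5] * 8   # active in frames 2, 3
result = separation_result([stream_1, stream_2])
print(result)
```

Because each speaker has a separate stream, frame 2 appears in both speakers' segments, which is how overlapping speech can be represented in this sketch.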

Description

Technical field

[0001] The present application relates to the technical field of speech processing, and in particular to a speaker separation method and related equipment.

Background

[0002] Speaker separation technology classifies each frame of audio data in a piece of speech data by speaker and combines the frames belonging to the same speaker into one speech segment. This yields at least one speech segment, each with a different speaker, so that speaker information can later be marked on each segment.

[0003] With the development of speaker separation technology, its application scenarios keep growing. For example, it can be applied to scenarios such as conference content organization and speech transcription.

[0004] However, due to defects in the rela...
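The frame grouping described in paragraph [0002] can be sketched as follows. The sketch assumes that each frame already carries exactly one speaker label and that only consecutive frames with the same label are merged into a segment; the label values and the function name `group_frames_by_speaker` are hypothetical.

```python
from itertools import groupby


def group_frames_by_speaker(frame_labels):
    """Collect runs of identical frame labels into per-speaker segments.

    Returns a dict that maps each speaker label to a list of
    (start_frame, end_frame) spans, end exclusive.
    """
    segments = {}
    index = 0
    for speaker, run in groupby(frame_labels):
        length = sum(1 for _ in run)
        segments.setdefault(speaker, []).append((index, index + length))
        index += length
    return segments


# Hypothetical frame-level labels for six frames.
labels = ["A", "A", "B", "B", "B", "A"]
segments = group_frames_by_speaker(labels)
print(segments)
```

A single label per frame, as assumed here, has no way to mark a frame in which two speakers talk at once.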


Application Information


IPC(8): G10L17/02, G10L17/04, G10L17/18, G10L19/02, G10L19/04, G10L21/0272

CPC: G10L17/02, G10L17/04, G10L17/18, G10L19/02, G10L19/04, G10L21/0272

Inventor: 孙磊, 方昕, 吴明辉, 李永超, 刘俊华

Owner: IFLYTEK CO LTD
