Voice detection method based on multiple sound areas, related device and storage medium

A voice detection and sound zone technology, applied in voice analysis, voice recognition, instruments, and related fields. It addresses the problems of low signal strength at the microphone array, large transmission loss of the voice signal, and the resulting poor processing effect.

Active Publication Date: 2020-10-27

TENCENT TECH (SHENZHEN) CO LTD

15 Cites, 6 Cited by

AI Technical Summary

Problems solved by technology

[0004] However, when there are multiple speakers in the environment, judging the main speaker only by the strength of the arriving signal is flawed, because the main speaker may be farther from the microphone array than an interfering speaker. Even if the main speaker speaks louder than the interfering speaker, the speech signal suffers greater propagation loss over the longer path, so its strength on reaching the microphone array may actually be lower, which degrades subsequent speech processing.
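
The effect described in [0004] can be illustrated with a small worked example. This is only a sketch under stated assumptions: it uses the free-field inverse-distance law (about 6 dB of loss per doubling of distance), which the patent text does not specify, and the speaker levels and distances are hypothetical numbers chosen for illustration.

```python
import math


def arrival_level_db(level_at_1m_db: float, distance_m: float) -> float:
    """Level reaching the microphone array.

    Assumption (not from the patent): free-field spherical spreading,
    so the level drops by 20*log10(distance) relative to 1 m.
    """
    return level_at_1m_db - 20.0 * math.log10(distance_m)


# Hypothetical scene: the main speaker is louder but farther away.
main_speaker = arrival_level_db(level_at_1m_db=70.0, distance_m=4.0)
interferer = arrival_level_db(level_at_1m_db=65.0, distance_m=1.0)

print(f"main speaker at array: {main_speaker:.2f} dB")
print(f"interferer at array:   {interferer:.2f} dB")

# A detector that simply picks the strongest arriving signal
# would select the interferer in this scene.
strongest = "main" if main_speaker > interferer else "interferer"
print("strongest arriving signal:", strongest)
```

Under these assumptions the main speaker arrives roughly 7 dB weaker than the interferer despite being 5 dB louder at the source, which is the failure case the paragraph describes.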

Embodiment Construction

[0087] The embodiments of the present application provide a multi-sound region-based speech detection method, related devices, and storage media. In a multi-sound-source scenario, speech signals from different directions can be retained or suppressed through control signals, so that the voice of each user is separated and enhanced in real time. This improves the accuracy of voice detection and, in turn, the effect of voice processing.

[0088] The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of this application and in the above drawings are used to distinguish similar users, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used are interchangeable under appropriate circumstances, such that the embodiments of the application described herein can, for example, be practiced in sequences other than those illustrated or described herein. Furthermore, the terms "comprising...

Abstract

The invention discloses a voice detection method based on multiple sound areas, applied to the field of artificial intelligence. The method comprises the following steps: obtaining sound area information corresponding to each of N sound areas; generating a control signal for each sound area according to its sound area information; processing the voice input signal of each sound area with the corresponding control signal to obtain a voice output signal for that sound area; and generating a voice detection result according to the voice output signal of each sound area. The invention further discloses a voice detection device and a storage medium. Voice signals from different directions can be processed in parallel across the sound areas, and in a multi-sound-source scene the voice signals from different directions are retained or suppressed through the control signals, so that the voice of each user can be separated and enhanced in real time and the accuracy of voice detection is improved.
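
The four steps in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `has_user` field, the "retain"/"suppress" labels, the unit or zero gain, and the energy-threshold detector are all hypothetical stand-ins, since the abstract does not say what the sound area information contains or how the control signal is applied.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SoundZoneInfo:
    zone_id: int
    has_user: bool  # hypothetical field, not named in the abstract


def generate_control_signal(info: SoundZoneInfo) -> str:
    # Step 2: derive a control signal from the zone information.
    return "retain" if info.has_user else "suppress"


def apply_control(samples: List[float], control: str) -> List[float]:
    # Step 3: a simple gain stands in for the real retain/suppress processing.
    gain = 1.0 if control == "retain" else 0.0
    return [gain * s for s in samples]


def detect_voice(samples: List[float], energy_threshold: float = 0.01) -> bool:
    # Step 4: hypothetical energy-based detector on the output signal.
    if not samples:
        return False
    energy = sum(s * s for s in samples) / len(samples)
    return energy > energy_threshold


def detect_per_zone(
    infos: List[SoundZoneInfo], inputs: Dict[int, List[float]]
) -> Dict[int, bool]:
    results = {}
    for info in infos:  # Step 1: zone information is given per zone
        control = generate_control_signal(info)
        output = apply_control(inputs[info.zone_id], control)
        results[info.zone_id] = detect_voice(output)
    return results


# Hypothetical input: N = 3 zones, one short frame of samples per zone.
infos = [SoundZoneInfo(0, True), SoundZoneInfo(1, False), SoundZoneInfo(2, True)]
inputs = {
    0: [0.5, -0.4, 0.3, -0.5],       # user speaking
    1: [0.6, -0.6, 0.5, -0.5],       # loud source in a zone marked for suppression
    2: [0.001, -0.002, 0.001, 0.0],  # user present but silent
}
results = detect_per_zone(infos, inputs)
print(results)
```

Each zone is handled independently here, so the loop could be run in parallel across zones, which matches the parallel processing the abstract mentions. Note that the loud source in zone 1 is not reported as voice because its zone is suppressed before detection.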

Description

Technical field

[0001] The present application relates to the field of artificial intelligence, and in particular to a multi-sound region-based speech detection method, related devices and storage media.

Background technique

[0002] With the wide application of far-field voice in people's daily life, performing voice activity detection (VAD), separation, enhancement, and recognition for each possible sound source in multi-sound-source (or multi-user) scenarios has become a bottleneck for many intelligent voice products seeking to improve their voice interaction performance.

[0003] In the traditional scheme, a monophonic pre-processing system based on a main-speaker detection algorithm is designed. The pre-processing system generally uses azimuth estimation combined with signal strength estimation, or azimuth estimation combined with spatial spectrum estimation, to estimate the speaker with the strongest signal energy (that is, the signal energy reaching the microphone array) ...

Application Information

IPC(8): G10L21/028, G10L25/27, G10L15/18, G10L15/22

CPC: G10L21/028, G10L25/27, G10L15/22, G10L15/1815, G10L2021/02166, G10L2021/02087, G10L21/0208, G10L25/84, G06T7/20, G06T2207/30201, G10L17/02, G10L17/22, G10L25/21

Inventor: 郑脊萌, 陈联武, 黎韦伟, 段志毅, 于蒙, 苏丹, 姜开宇

Owner TENCENT TECH (SHENZHEN) CO LTD
