Voice detection method based on multiple sound zones, related device, and storage medium

A voice detection and sound zone technology, applied in voice analysis, voice recognition, instruments, etc., that addresses the problems of low signal strength reaching the microphone array, large propagation loss of the voice signal, and poor processing results, so as to improve the accuracy of voice detection and the effect of voice processing.

Active Publication Date: 2022-07-26

TENCENT TECH (SHENZHEN) CO LTD

11 Cites, 0 Cited by


AI Technical Summary

Problems solved by technology

[0004] However, when there are multiple speakers in the environment, judging the main speaker only by the strength of the arriving signal is flawed, because the main speaker may be farther away from the microphone array than the interfering speaker. Although the main speaker may be louder than the interfering speaker, the speech signal suffers a greater propagation loss over the longer path, so the signal strength reaching the microphone array may actually be smaller, which degrades subsequent speech processing.
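The effect described in [0004] can be illustrated with a small numerical sketch. It assumes free-field spherical spreading, where the level falls by 20·log10 of the distance ratio; the patent text does not state a propagation model, and the source levels and distances below are made-up values chosen only for illustration.

```python
import math


def received_level_db(source_level_db_at_1m: float, distance_m: float) -> float:
    """Level at the microphone array under an ASSUMED free-field inverse-distance law."""
    return source_level_db_at_1m - 20.0 * math.log10(distance_m)


# Hypothetical scenario: the main speaker is louder at the source but farther away.
main_level = received_level_db(70.0, 4.0)        # 70 dB at 1 m, standing 4 m from the array
interferer_level = received_level_db(65.0, 1.0)  # 65 dB at 1 m, standing 1 m from the array

# A selector that looks only at arrival strength picks the louder arrival.
strength_only_pick = "main" if main_level > interferer_level else "interferer"

print(f"main speaker at array: {main_level:.1f} dB")
print(f"interferer at array:   {interferer_level:.1f} dB")
print("strength-only pick:", strength_only_pick)
```

Under these assumed numbers the main speaker loses about 12 dB over the 4 m path, so a strength-only rule selects the interfering speaker even though the main speaker is louder at the source.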



Embodiment Construction

[0087] The embodiments of the present application provide a voice detection method based on multiple sound zones, a related device, and a storage medium. In a multi-sound-source scenario, control signals can be used to retain or suppress voice signals from different directions, so that the voice of each user is separated and enhanced in real time, thereby improving the accuracy of voice detection and the effect of voice processing.

[0088] The terms "first", "second", "third", "fourth", etc. (if any) in the description and claims of this application and in the above-mentioned drawings are used to distinguish similar objects, and are not necessarily used to describe a specific order or sequence. It is to be understood that the data so used may be interchanged under appropriate circumstances such that the embodiments of the application described herein can, for example, be practiced in sequences other than those illustrated or descr...


Abstract

The present application discloses a voice detection method based on multiple sound zones, which is applied to the field of artificial intelligence. The voice detection method includes: acquiring the sound zone information corresponding to each of N sound zones; generating a control signal for each sound zone according to the sound zone information corresponding to that sound zone; processing the voice input signal corresponding to each sound zone with the control signal for that sound zone to obtain the voice output signal corresponding to each sound zone; and generating a voice detection result according to the voice output signal corresponding to each sound zone. The present application also discloses a voice detection device and a storage medium. The present application can process speech signals from different directions in parallel based on multiple sound zones. In a scenario with multiple sound sources, the control signals can be used to retain or suppress speech signals from different directions, so that the voice of each user can be separated and enhanced in real time, thereby improving the accuracy of voice detection.
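The four steps in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `SoundZoneInfo` fields, the rule "retain a zone only when a user is present", the 0/1 gain used as the control signal, and the energy-threshold detector are all hypothetical, since the abstract does not specify any of them.

```python
import random
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SoundZoneInfo:
    """Hypothetical per-zone information; the field names are assumptions."""
    zone_id: int
    user_present: bool


def make_control_signal(info: SoundZoneInfo) -> float:
    """Step 2: derive a control signal (assumed here to be a 0/1 gain)."""
    return 1.0 if info.user_present else 0.0


def process_zone(input_signal: List[float], control: float) -> List[float]:
    """Step 3: retain (gain 1) or suppress (gain 0) the zone's input signal."""
    return [control * sample for sample in input_signal]


def detect_voice(output_signal: List[float], threshold: float = 1e-3) -> bool:
    """Step 4: assumed energy-threshold detector on the zone's output signal."""
    energy = sum(s * s for s in output_signal) / len(output_signal)
    return energy > threshold


def run_pipeline(infos: List[SoundZoneInfo],
                 inputs: Dict[int, List[float]]) -> Dict[int, bool]:
    """Run steps 2 to 4 independently for every sound zone."""
    results = {}
    for info in infos:  # Step 1: zone information is given per zone
        control = make_control_signal(info)
        output = process_zone(inputs[info.zone_id], control)
        results[info.zone_id] = detect_voice(output)
    return results


# Toy input: both zones carry noise-like audio, but only zone 0 has a user.
rng = random.Random(0)
zone_infos = [SoundZoneInfo(0, True), SoundZoneInfo(1, False)]
zone_inputs = {z: [rng.gauss(0.0, 0.1) for _ in range(1600)] for z in (0, 1)}

results = run_pipeline(zone_infos, zone_inputs)
print(results)
```

Because each zone is handled independently, the loop could run in parallel, which matches the abstract's point about processing signals from different directions in parallel. In this toy setup the suppressed zone yields zero output energy and therefore cannot trigger a detection.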

Description

Technical field

[0001] The present application relates to the field of artificial intelligence, and in particular to a voice detection method based on multiple sound zones, a related device, and a storage medium.

Background

[0002] With the widespread application of far-field speech in daily life, performing voice activity detection (VAD), separation, enhancement, recognition, and call processing for each possible sound source in multi-sound-source (or multi-user) scenarios has become a bottleneck for a variety of intelligent voice products seeking to improve their voice interaction performance.

[0003] In the traditional scheme, a monophonic pre-processing system based on a main speaker detection algorithm is designed. The pre-processing system generally combines azimuth angle estimation with signal strength estimation, or azimuth angle estimation with spatial spectrum estimation. The speaker with the s...


Application Information


Patent Type & Authority: Patents (China)

IPC (8): G10L21/028, G10L25/27, G10L15/18, G10L15/22

CPC: G10L21/028, G10L25/27, G10L15/22, G10L15/1815, G10L2021/02166, G10L2021/02087, G10L21/0208, G10L25/84, G06T7/20, G06T2207/30201, G10L17/02, G10L17/22, G10L25/21

Inventor: 郑脊萌, 陈联武, 黎韦伟, 段志毅, 于蒙, 苏丹, 姜开宇

Owner: TENCENT TECH (SHENZHEN) CO LTD
