Audio indexing method based on multi-distance sound sensor

The method relates to acoustic sensor and distance technology and is applied in instruments, speech analysis, speech recognition, and similar fields. It addresses problems such as limited training data, the small-sample problem, and loss of discriminant information.

Active Publication Date: 2013-06-12
TSINGHUA UNIV
2 Cites, 0 Cited by

AI Technical Summary

Problems solved by technology

The GMM-SVM method performs better, but problems remain: the GMM has too many parameters to estimate when modeling the probability density, the training data are limited, and GMM-SVM is aimed mainly at speaker recognition and has not developed into a general-purpose technique.
The LPP-based algorithm has drawbacks: dimensionality reduction affects the manifold distribution of the data, which leads to loss of discriminant information, the small-sample problem, and related issues.
To address the small-sample problem, Yang et al. proposed the Null-space Locality Preserving Projections algorithm (NDLPP), but this method uses only the discriminant information in the null space and ignores the discriminant information in the principal space.
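The small-sample problem can be made concrete with a minimal sketch of plain LPP. This is the textbook form of the algorithm, not the patent's own method or NDLPP. The function names, the k-nearest-neighbour graph, the heat-kernel width `t`, and the `ridge` term are all assumptions chosen for illustration. LPP solves the generalized eigenproblem X^T L X a = lambda X^T D X a; when there are fewer samples than feature dimensions, X^T D X is rank-deficient, which is the small-sample problem mentioned above.

```python
import numpy as np

def knn_heat_graph(X, k=5, t=None):
    """Symmetric k-nearest-neighbour graph with heat-kernel weights (illustrative)."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(dist2, np.inf)            # a point is not its own neighbour
    cols = np.argsort(dist2, axis=1)[:, :k]    # indices of the k nearest points
    rows = np.repeat(np.arange(n), k)
    d_nn = dist2[rows, cols.ravel()]
    if t is None:                              # assumed default: median neighbour distance
        t = max(float(np.median(d_nn)), 1e-12)
    W = np.zeros((n, n))
    W[rows, cols.ravel()] = np.exp(-d_nn / t)
    return np.maximum(W, W.T)                  # symmetrise the graph

def lpp(X, n_components=2, k=5, t=None, ridge=1e-8):
    """Plain LPP: returns a (d, n_components) projection matrix.

    Intended for the case n_samples > n_features. The ridge term only keeps
    the Cholesky factorisation stable; it does not cure the small-sample case.
    """
    X = np.asarray(X, dtype=float)
    W = knn_heat_graph(X, k, t)
    D = np.diag(W.sum(axis=1))
    A = X.T @ (D - W) @ X                      # X^T L X, with L = D - W
    B = X.T @ D @ X + ridge * np.eye(X.shape[1])
    C = np.linalg.cholesky(B)                  # B = C C^T
    C_inv = np.linalg.inv(C)
    M = C_inv @ A @ C_inv.T                    # equivalent standard eigenproblem
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)  # eigenvalues in ascending order
    return C_inv.T @ vecs[:, :n_components]    # smallest eigenvalues preserve locality

rng = np.random.default_rng(0)
P = lpp(rng.normal(size=(50, 4)), n_components=2, k=5)
print("projection shape:", P.shape)

# Small-sample case: 5 samples in 20 dimensions, so X^T D X cannot be full rank.
X_small = rng.normal(size=(5, 20))
W_small = knn_heat_graph(X_small, k=2)
B_small = X_small.T @ np.diag(W_small.sum(axis=1)) @ X_small
print("rank of X^T D X:", np.linalg.matrix_rank(B_small), "of", B_small.shape[0])
```

In the small-sample case the ridge term would let the solver run, but the smallest eigenvalues then fall in directions orthogonal to every sample, so the projected data collapse toward zero. That degeneracy is one way to see why null-space and principal-space variants were proposed.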

Embodiment Construction

[0048] The present invention will be described in further detail below in conjunction with the accompanying drawings and embodiments.

[0049] The input devices for SPKR include headset microphones, single microphones, microphone arrays, and multiple distance microphones (Multiple Distance Microphones). A multi-distance acoustic sensor meets the requirements of complex dialogue scenarios with multiple sound sources and directions, and can be applied to sound source localization, speaker clustering, speaker identification, and similar tasks. Because of the particular topology of the multi-distance acoustic sensor, the multi-time-delay feature can be used to classify spatially non-overlapping sound sources.
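One way to picture a multi-time-delay feature is as the vector of pairwise arrival-time differences between the sensors. The sketch below estimates each pairwise delay with GCC-PHAT, a widely used delay estimator. The text shown here does not say which estimator the invention uses, so GCC-PHAT, the function names, and the pair ordering are assumptions for illustration only.

```python
from itertools import combinations

import numpy as np

def gcc_phat_delay(x, y, max_lag):
    """Delay of y relative to x in samples (positive: y arrives later).

    max_lag must be at least 1 and smaller than the signal length.
    """
    n = len(x) + len(y)                        # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)                     # cross-power spectrum
    cross /= np.abs(cross) + 1e-12             # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    # Reorder so the array covers lags -max_lag .. +max_lag.
    cc = np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))
    return int(np.argmax(cc)) - max_lag

def multi_delay_feature(channels, max_lag):
    """Stack the delays of every sensor pair (i < j) into one feature vector.

    channels: array of shape (n_sensors, n_samples), one row per sensor.
    Four sensors give a 6-dimensional feature.
    """
    pairs = combinations(range(len(channels)), 2)
    return np.array([gcc_phat_delay(channels[i], channels[j], max_lag)
                     for i, j in pairs])

# Synthetic check: one noise source reaching four sensors with different delays.
rng = np.random.default_rng(0)
source = rng.normal(size=4096)
true_delays = [0, 3, 7, 2]                     # hypothetical per-sensor delays in samples
channels = np.stack([np.concatenate((np.zeros(d), source[:len(source) - d]))
                     for d in true_delays])
print(multi_delay_feature(channels, max_lag=16))
```

Sources at different positions produce different delay vectors, which is what makes such a feature usable for separating spatially non-overlapping speakers. In real recordings the delays would be estimated per short frame rather than over the whole signal.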

[0050] As shown in Figure 1, the multi-distance acoustic sensor system includes multiple acoustic sensors, represented in Figure 1 by the four acoustic sensors 111-114, which are placed at random positions on the same platform. Likewise, only three sou...

Abstract

The invention discloses an audio indexing method based on a multi-distance sound sensor. In the method, a multi-distance sound sensor serves as the recording device that captures the audio of a multimedia conference; a spatial multi-delay feature is extracted from the multi-distance sound sensor as the feature that distinguishes different speakers; and a new manifold algorithm is adopted to reduce the dimension of the multi-delay feature and classify the speakers by identity. The method can reduce the complexity and computational cost of the system. Finally, the system outputs the audio segments and identity of each speaker as the audio index information. The optimal discriminant vector set obtained by the method can, in theory, achieve optimal discrimination, and the method can be applied to multi-person, multi-party conversation scenes in complicated acoustic environments.
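The final step described above, turning per-speaker classification results into audio index information, can be sketched as merging runs of identical frame labels into time-stamped segments. The label format, the `frame_shift_s` parameter, and the tuple layout are hypothetical; the abstract does not specify the output format.

```python
from itertools import groupby

def build_audio_index(frame_labels, frame_shift_s=0.01):
    """Merge consecutive frames with the same speaker label into segments.

    frame_labels: one speaker identity per analysis frame, in time order.
    frame_shift_s: assumed time step between frames, in seconds.
    Returns a list of (speaker, start_time_s, end_time_s) tuples.
    """
    index = []
    position = 0                               # index of the first frame in the current run
    for speaker, run in groupby(frame_labels):
        length = sum(1 for _ in run)           # number of frames in this run
        start = position * frame_shift_s
        end = (position + length) * frame_shift_s
        index.append((speaker, start, end))
        position += length
    return index

# Hypothetical classifier output for eight frames of a two-speaker conversation.
labels = ["spk1", "spk1", "spk1", "spk2", "spk2", "spk1", "spk1", "spk2"]
for speaker, start, end in build_audio_index(labels, frame_shift_s=0.5):
    print(f"{speaker}: {start:.1f} s to {end:.1f} s")
```

A practical system would probably smooth the frame labels before merging so that a single misclassified frame does not split a segment, but that choice is outside what the abstract states.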

Description

Technical field

[0001] The invention belongs to the field of audio technology and relates to audio indexing, in particular to an audio indexing method based on a multi-distance acoustic sensor.

Background

[0002] Teleconferencing and video conferencing have become increasingly common in business activities and daily life, and the volume of recorded data has grown geometrically. In such scenarios, a single piece of audio data usually contains multiple sound sources. Audio indexing techniques can process such data and lighten the load on post-processing steps such as speech recognition.

[0003] Audio indexing technology automatically extracts information from audio data so that target content can be searched and discovered. Speaker classification is the key technology of audio indexing and comprises three parts: feature extraction, speech segmentation, and classification decision-making. The main algorithms are Gaussian mixture log-likelihood ra...

Application Information

Patent Type & Authority: Patents (China)
IPC: IPC(8) G10L15/08
Inventor: 杨毅, 陈国顺, 王胜开
Owner: TSINGHUA UNIV