
Mega speaker identification (ID) system and corresponding methods therefor

Speaker recognition technology, applied in the field of speaker recognition systems, that addresses problems such as the lack of an optimal classification scheme.

Inactive Publication Date: 2005-08-31
KONINKLIJKE PHILIPS ELECTRONICS NV
Cites: 0 | Cited by: 40

AI Technical Summary

Problems solved by technology

Thus, while Scheirer and Slaney correctly deduced that classifier development should focus on a limited number of classifiers rather than the multiple classifiers proposed by others, they developed neither an optimal classification scheme for classifying audio frames nor an optimal speaker recognition scheme for...



Embodiment Construction

[0043] The present invention is based in part on the observation by Scheirer and Slaney that the choice of features supplied to a classifier is actually more critical to classification performance than the type of classifier itself. The inventors investigated a total of 143 classification features potentially useful for classifying continuous general audio data (GAD) into seven categories. The seven audio categories employed in the mega speaker identification (ID) system according to the present invention are silence, single speaker speech, music, ambient noise, multiple speaker speech, simultaneous speech and music, and speech and noise. It should be noted that the ambient noise category refers to noise without foreground sound, while the simultaneous speech and music category includes both singing and speech with background music. Exemplary waveforms for six of the seven categories are shown in Figure 1; waveforms of the silence category were omitted for self-explanatory rea...
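The seven categories listed above can be written down as a small enumeration. This is only an illustrative sketch: the names `AudioClass`, `SPEECH_CLASSES`, and `contains_speech` are hypothetical and do not come from the patent, and treating every category whose name mentions speech as speech-bearing is an assumption made here for illustration.

```python
from enum import Enum


class AudioClass(Enum):
    """The seven general audio data (GAD) categories named in paragraph [0043]."""
    SILENCE = "silence"
    SINGLE_SPEAKER_SPEECH = "single speaker speech"
    MUSIC = "music"
    AMBIENT_NOISE = "ambient noise"
    MULTIPLE_SPEAKER_SPEECH = "multiple speaker speech"
    SPEECH_AND_MUSIC = "simultaneous speech and music"
    SPEECH_AND_NOISE = "speech and noise"


# Assumption (not from the patent text): every category whose name
# mentions speech is treated as speech-bearing.
SPEECH_CLASSES = frozenset({
    AudioClass.SINGLE_SPEAKER_SPEECH,
    AudioClass.MULTIPLE_SPEAKER_SPEECH,
    AudioClass.SPEECH_AND_MUSIC,
    AudioClass.SPEECH_AND_NOISE,
})


def contains_speech(label: AudioClass) -> bool:
    """Return True if the category includes speech of any kind."""
    return label in SPEECH_CLASSES
```

A downstream step could use `contains_speech` to decide which segments are worth passing on to speaker identification.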



Abstract

A memory storing computer readable instructions for causing a processor associated with a mega speaker identification (ID) system to instantiate functions including an audio segmentation and classification function (F10) receiving general audio data (GAD) and generating segments, a feature extraction function (F12) receiving the segments and extracting features based on mel-frequency cepstral coefficients (MFCC) therefrom, a learning and clustering function (14) receiving the extracted features and reclassifying segments, when required, based on the extracted features, a matching and labeling function (16) assigning a speaker ID to speech signals within the GAD, and a database function for correlating the assigned speaker ID to the respective speech signals within the GAD. The audio segmentation and classification function can assign each segment to one of N audio signal classes including silence, single speaker speech, music, environmental noise, multiple speakers' speech, simultaneous speech and music, and speech and noise.
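The flow of data through the functions named in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: the `Segment` type, the `label_speakers` function, the choice of speech classes, the nearest-centroid matching rule, and the distance threshold are all hypothetical, and the `features` tuple merely stands in for an MFCC-based feature vector that a real feature extraction step would compute.

```python
import math
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Segment:
    start: float                  # segment start time in seconds
    end: float                    # segment end time in seconds
    audio_class: str              # label from segmentation and classification
    features: Tuple[float, ...]   # stand-in for an MFCC-based feature vector


# Hypothetical choice of classes that are passed on to speaker labeling.
SPEECH_CLASSES = {"single speaker speech", "speech and noise"}


def label_speakers(
    segments: Sequence[Segment], threshold: float = 1.0
) -> Dict[str, List[Tuple[float, float]]]:
    """Assign a speaker ID to each speech segment (assumed rule).

    A segment joins the nearest existing speaker centroid if it lies
    within `threshold` (Euclidean distance); otherwise a new speaker ID
    is opened. The returned dict plays the role of the database that
    correlates speaker IDs with time spans.
    """
    centroids: List[List[float]] = []
    counts: List[int] = []
    database: Dict[str, List[Tuple[float, float]]] = defaultdict(list)

    for seg in segments:
        if seg.audio_class not in SPEECH_CLASSES:
            continue  # non-speech segments receive no speaker ID

        best_id, best_dist = None, math.inf
        for i, centroid in enumerate(centroids):
            dist = math.dist(seg.features, centroid)
            if dist < best_dist:
                best_id, best_dist = i, dist

        if best_id is None or best_dist > threshold:
            # No sufficiently close speaker: start a new cluster.
            centroids.append(list(seg.features))
            counts.append(1)
            best_id = len(centroids) - 1
        else:
            # Update the running mean of the matched cluster.
            n = counts[best_id]
            centroids[best_id] = [
                (c * n + f) / (n + 1)
                for c, f in zip(centroids[best_id], seg.features)
            ]
            counts[best_id] = n + 1

        database[f"speaker_{best_id}"].append((seg.start, seg.end))

    return dict(database)
```

Calling `label_speakers` on a list of classified segments returns a mapping from speaker ID to the time spans attributed to that speaker, which corresponds loosely to the database function described in the abstract.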

Description

Technical field

[0001] The present invention relates generally to speaker identification (ID) systems. More specifically, the present invention relates to speaker recognition systems employing automatic audio signal segmentation based on mel-frequency cepstral coefficients (MFCCs) extracted from the audio signal. A corresponding method suitable for processing signals from multiple audio signal sources is also disclosed.

Background art

[0002] Speaker recognition systems currently exist. More specifically, there exist speaker recognition systems based on low-level audio features, which generally require that the set of speakers be known a priori. In such speaker recognition systems, when new audio material is analyzed, it is always classified into one of the known speaker classes.

[0003] It should be noted that several research groups are working on the research and development of methods for automatic annotation of images and videos for content-based indexing and subsequ...
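For readers unfamiliar with the "mel" in MFCC, the sketch below shows one widely used mel-scale conversion and how it spaces filter center frequencies. The excerpt above does not say which mel formula or filterbank layout the patent uses, so the constants 2595 and 700, the function names, and the even spacing in mel are assumptions chosen for illustration only.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert frequency in Hz to mels (a common log10-based variant)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(
    num_filters: int, low_hz: float, high_hz: float
) -> List[float]:
    """Center frequencies (Hz) of filters spaced evenly on the mel scale.

    The endpoints low_hz and high_hz are the outer edges of the
    filterbank and are not themselves returned as centers.
    """
    low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (num_filters + 1)
    return [mel_to_hz(low_mel + step * (i + 1)) for i in range(num_filters)]
```

Filters spaced this way are narrow at low frequencies and wider at high frequencies. A full MFCC computation would additionally window each frame, take a power spectrum, apply the filterbank, take logarithms, and apply a discrete cosine transform.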


Application Information

IPC(8): G10L15/00, G10L15/04, G10L15/10, G10L17/00, G10L17/04, G10L17/26, G10L25/09, G10L25/12, G10L25/18, G10L25/21, G10L25/24, G10L25/78, G10L25/81, G10L25/84, G10L25/90
CPC: G10L17/005, G10L15/04, G10L17/00, G10L17/02
Inventors: N. Dimitrova, D. Li
Owner KONINKLIJKE PHILIPS ELECTRONICS NV