Phonation Style Detection

A phonation style detection technology, applied in the field of speech processing, that addresses two problems: current approaches lack a decision-making process for handling different phonation styles, and current methods perform poorly on degraded speech.

Inactive Publication Date: 2018-05-17
THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SEC OF THE AIR FORCE

AI Technical Summary

Benefits of technology

The invention provides a method for detecting different styles of speech and making decisions based on those styles. This allows for the classification of audio messages based on their phonation styles, such as normal phonation, whispered phonation, softly spoken speech phonation, high-level phonation, babble phonation, and non-voice sounds. The purpose of this invention is to introduce the phonation style as a way to control computer software. The method involves detecting speech activity, extracting features from the detected speech activity, and characterizing those features before making decisions based on those characteristics. This allows for faster and more accurate speech recognition and control in conversational environments.
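The four steps named above (detect speech activity, extract features, characterize them, decide) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the source does not say which features or thresholds the invention uses, so the RMS level, the zero-crossing rate, every threshold value, and all function names below are hypothetical placeholders.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def frame_signal(samples: Sequence[float], frame_len: int, hop: int) -> List[Sequence[float]]:
    """Split the signal into overlapping, full-length frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    changes = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
    return changes / (len(frame) - 1)


def detect_speech_activity(frames, energy_floor: float):
    """Step 1: keep only frames whose level exceeds an assumed floor."""
    return [f for f in frames if rms(f) > energy_floor]


def extract_features(frames) -> List[Tuple[float, float]]:
    """Step 2: one (rms, zero-crossing rate) pair per active frame."""
    return [(rms(f), zero_crossing_rate(f)) for f in frames]


def characterize(features) -> Optional[Dict[str, float]]:
    """Step 3: summarize per-frame features; None if nothing was active."""
    if not features:
        return None
    n = len(features)
    return {"mean_rms": sum(r for r, _ in features) / n,
            "mean_zcr": sum(z for _, z in features) / n}


def decide(summary: Optional[Dict[str, float]]) -> str:
    """Step 4: map the summary to a label using placeholder thresholds."""
    if summary is None:
        return "no speech"
    if summary["mean_rms"] < 0.1:
        return "whispered" if summary["mean_zcr"] > 0.3 else "soft"
    return "high-level" if summary["mean_rms"] > 0.6 else "normal"


def classify_stream(samples, frame_len=200, hop=100, energy_floor=0.01) -> str:
    """Run the four steps in order on one audio buffer."""
    frames = frame_signal(samples, frame_len, hop)
    active = detect_speech_activity(frames, energy_floor)
    return decide(characterize(extract_features(active)))


# Usage: a synthetic 200 Hz tone sampled at 8 kHz, then pure silence.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(8000)]
print(classify_stream(tone))
print(classify_stream([0.0] * 8000))
```

A real detector would likely replace the hand-set thresholds with a trained classifier; the point here is only the order of the stages and the data each one hands to the next.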

Problems solved by technology

Current technologies work well only when the speaker uses a normal speaking or phonation style (not loud, babbled, whispered, or pitch-altered speech), and they assume that the speaker wants to be heard and understood.
The current state of the art has considered noise-degraded speech applications, where the goal is to extract the target speech or suppress the interfering speech, but it lacks a decision-making process for addressing different phonation styles.
Current methods also experience issues in situations of degraded speech which can occur in almost any communication setting.
Typically, speech degradation is assumed to be due to environmental noise or communication channel artifacts.
However, speech degradation can also occur due to changes in phonation style.
A speech processing algorithm that is trained using normally phonated speech but is given whispered or high-volume phonation style speech will quickly degrade and create nonsensical outputs.
In an unconstrained, dynamically changing environment, speech recognizers have not been able to accurately recognize spoken dialogue.
This process is complicated by multiple speakers, noisy environments, and unstructured lexical information.
The current state of the art lacks an approach for uniquely classifying the various phonation styles; it also does not address how to make appropriate follow-on decisions.
When there are multiple speakers or the speaker wishes to obfuscate his/her communication, the output of speech recognizers quickly degrades into nonsensical lexical information.

Examples

Embodiment Construction

[0030] The invention described herein allows an audio stream to be analyzed and classified based on the phonation style and then, depending on the application, a software control process is able to make appropriate control decisions. The goal of spoken language is to communicate. Communication occurs when the intended recipient of the spoken message receives the intended message from the speaker. While communication can occur based on body language, facial expressions, written words, and the spoken message, this invention addresses phonation style detection by using only the spoken message.
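The idea that the control decision depends on both the detected phonation style and the application can be sketched as a per-application lookup table. The style names come from the text; the policy table, the action strings, and the function name are hypothetical and are not taken from the patent.

```python
from enum import Enum
from typing import Dict


class PhonationStyle(Enum):
    """Phonation styles listed in the source text."""
    NORMAL = "normal"
    WHISPERED = "whispered"
    SOFT = "softly spoken"
    HIGH_LEVEL = "high-level"
    BABBLE = "babble"
    NON_VOICE = "non-voice"


# Hypothetical policy for an imagined transcription application.
# Another application would supply its own table.
TRANSCRIPTION_POLICY: Dict[PhonationStyle, str] = {
    PhonationStyle.NORMAL: "transcribe",
    PhonationStyle.SOFT: "boost_gain_then_transcribe",
    PhonationStyle.WHISPERED: "flag_for_review",
    PhonationStyle.HIGH_LEVEL: "attenuate_then_transcribe",
    PhonationStyle.BABBLE: "skip_segment",
}


def control_decision(style: PhonationStyle,
                     policy: Dict[PhonationStyle, str],
                     default: str = "ignore") -> str:
    """Look up the action for a style; unlisted styles get the default."""
    return policy.get(style, default)


# Usage: the same classifier output drives different actions per policy.
print(control_decision(PhonationStyle.WHISPERED, TRANSCRIPTION_POLICY))
print(control_decision(PhonationStyle.NON_VOICE, TRANSCRIPTION_POLICY))
```

Keeping the policy as data rather than branching logic is one way to let the classifier stay fixed while each application chooses its own response to a given style.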

[0031] For typical speech applications, normal phonation is assumed and the audio applications attempt to process all the incoming audio. Some audio applications try to detect aberrations to this assumption, for example, if the energy level drops significantly. In this case, the algorithm may label it as whispered speech. This invention is unique in that no assumptions are made as to the type of pho...

Abstract

The invention provides a method for detecting phonation style in dynamic communication environments and making software control decisions based on that style. An audio message can thereby be classified by phonation style, such as, but not limited to, normal phonation, whispered phonation, softly spoken speech phonation, high-level phonation, babble phonation, and non-voice sounds. The purpose of the invention is to introduce the phonation style as a way to control computer software.

Description

STATEMENT OF GOVERNMENT INTEREST

[0001] The invention described herein may be manufactured and used by or for the Government for governmental purposes without the payment of any royalty thereon.

BACKGROUND OF THE INVENTION

[0002] Phonation is the rapid, periodic opening and closing of the glottis through separation and apposition of the vocal folds that, accompanied by breath under lung pressure, constitutes a source of vocal sound. Technology exists that detects sound created through phonation and attempts to understand and decode the sounds. Examples include Siri, automated phone menus, and Shazam. Limitations on the current technologies mean that they only work well when the speaker is using a normal speaking or phonation style not including loud, babble, whisper, or pitch and they assume that the speaker wants to be heard and understood.

[0003] Phonation style refers to different speaking styles which may include normal phonation, whispered speech phonation, low-level speech phonation...

Application Information

IPC(8): G10L25/84, G10L15/02, H04L45/02, H04L45/24, H04L45/74
CPC: G10L25/84, G10L15/02, G10L25/21, G10L25/90, G10L25/24, G10L15/16, G06F21/32, H04L45/02, G06F21/60, G06F21/72, H04L2209/08, H04L9/002, H04L2209/12, G06F21/79, H04L45/22, H04L45/54, H04L63/162
Inventor: WENNDT, STANLEY J.; HADDAD, DARREN M.
Owner: THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SEC OF THE AIR FORCE