
A multi-modal non-contact sentiment analysis recording system

A sentiment analysis and recording technology in the field of human-computer emotional interaction, addressing problems such as high feature dimensionality, large feature counts, and the curse of dimensionality.

Active Publication Date: 2016-10-26
山东心法科技有限公司

AI Technical Summary

Problems solved by technology

[0003] Current research mainly extracts emotional feature information from speech prosody. Speech emotion recognition systems rely chiefly on low-level acoustic features of speech, the representative ones being pitch frequency, formants, short-term average zero-crossing rate, and utterance duration. This approach tends to produce high feature dimensionality. Pattern recognition research shows that accuracy is not proportional to the dimensionality of the feature space, that generalization ability weakens in high-dimensional settings, and that high dimensionality can even lead to the curse of dimensionality.
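The short-term average zero-crossing rate named above is one of the simplest of these low-level acoustic features. A minimal sketch in pure Python (not from the patent; the frame length and the synthetic test signal are illustrative assumptions):

```python
# Short-term average zero-crossing rate (ZCR), one of the low-level
# acoustic features mentioned in the text. The frame length and the
# synthetic 100 Hz sine signal below are illustrative assumptions.
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def short_term_zcr(signal, frame_len=256):
    """ZCR computed per non-overlapping frame of the signal."""
    return [
        zero_crossing_rate(signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]

# A 100 Hz sine sampled at 8 kHz crosses zero about 200 times per
# second, i.e. roughly 0.025 crossings per sample pair.
fs = 8000
signal = [math.sin(2 * math.pi * 100 * i / fs) for i in range(fs)]
rates = short_term_zcr(signal)
print(round(sum(rates) / len(rates), 3))
```

In a real system such per-frame values would be aggregated (mean, variance, range) into the feature vector whose growing dimensionality the paragraph above warns about.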
[0004] There is also a linguistic approach that analyzes the emotion of speech signals by considering the semantic content of the transcribed text, using the semantics and grammar of sentences to provide emotional clues about the speaker (for example, word history and word frequency). The disadvantage of this method is that it requires extensive knowledge: speech recognition itself is difficult, and semantic analysis demands additional linguistic knowledge, which increases the difficulty of sentiment analysis. The method is complex and hard to implement at this stage.
[0005] In speech emotion information processing, nearly all common pattern recognition methods have been applied, such as artificial neural networks (ANN), hidden Markov models (HMM), Gaussian mixture models (GMM), and support vector machines (SVM). Comparing all of these results, however, shows that the means of feature extraction remain extremely limited: almost all studies take prosodic features, or linear combinations and transformations of them, as the research object, and most analyze only the audio modality. As a result, speech emotional features remain confined to a narrow category and are not comprehensive enough.
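To make the "pattern recognition over prosodic feature vectors" pipeline concrete, here is a deliberately tiny sketch. A nearest-centroid classifier stands in for the ANN/HMM/GMM/SVM methods the paragraph lists (it is not one of them, chosen only for brevity), and all feature values and labels are invented for illustration:

```python
# Toy pattern-recognition sketch over prosodic feature vectors
# [pitch_hz, energy, zcr]. A nearest-centroid rule stands in for the
# ANN/HMM/GMM/SVM methods named in the text; the training data and
# labels below are invented for illustration.
import math

# Hypothetical training data: label -> list of feature vectors
train = {
    "neutral": [[120.0, 0.30, 0.04], [125.0, 0.28, 0.05]],
    "angry":   [[220.0, 0.80, 0.12], [210.0, 0.75, 0.11]],
}

def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

centroids = {label: centroid(vs) for label, vs in train.items()}

def classify(features):
    """Return the label whose centroid is nearest in Euclidean distance."""
    return min(
        centroids,
        key=lambda lab: math.dist(features, centroids[lab]),
    )

print(classify([215.0, 0.70, 0.10]))  # high pitch/energy -> "angry"
```

Whatever the classifier, the input stays a vector of prosodic features, which is exactly the limitation the paragraph criticizes: the audio modality alone, with no text channel.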

Method used




Embodiment Construction

[0050] In this embodiment, as shown in Figure 1, the multimodal non-contact emotion analysis and recording system comprises:
  • Sound receiving module: receives sound from the external environment;
  • Sound feature extraction and processing module: obtains audio emotion labeling information from the voice;
  • Speech recognition module: converts voice content into text content;
  • Text feature extraction and processing module: obtains text emotion labeling information from the voice;
  • Comprehensive scheduling module: handles all data processing, storage, and scheduling tasks;
  • Display module: displays the detected voice emotional state;
  • Clock module: records time and provides time labels;
  • Storage module: records the emotion labeling information of all input voices while powered on;
  • Button module: used for switching, setting the time, selectio...
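The comprehensive scheduling module must somehow combine the audio-channel and text-channel emotion labels produced by the two feature-extraction modules. The patent text does not specify a fusion rule; a common choice is weighted late fusion, sketched below with assumed weights and example scores:

```python
# Late-fusion sketch: combine per-label emotion scores from the audio
# and text channels. The patent does not specify the fusion rule; the
# weighted average, the weights, and the example scores are all
# illustrative assumptions.

def fuse(audio_scores, text_scores, audio_weight=0.5):
    """Weighted average of per-label scores from the two modalities."""
    labels = set(audio_scores) | set(text_scores)
    return {
        lab: audio_weight * audio_scores.get(lab, 0.0)
             + (1 - audio_weight) * text_scores.get(lab, 0.0)
        for lab in labels
    }

audio = {"happy": 0.6, "neutral": 0.3, "sad": 0.1}  # from audio features
text = {"happy": 0.2, "neutral": 0.7, "sad": 0.1}   # from text features

fused = fuse(audio, text, audio_weight=0.4)
best = max(fused, key=fused.get)
print(best, round(fused[best], 2))  # -> neutral 0.54
```

Fusing after each channel has produced its own scores (rather than concatenating raw features) also sidesteps the dimensionality problem described in paragraph [0003].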



Abstract

The invention discloses a multimodal non-contact emotion analysis and recording system comprising: a sound receiving module for receiving sound from the external environment; a sound feature extraction and processing module for obtaining audio emotion labeling information from the voice; a speech recognition module for converting speech content into text; a text feature extraction and processing module for obtaining text emotion labeling information from the voice; a comprehensive scheduling module for handling all data processing, storage, and scheduling tasks; a display module for displaying the detected voice emotional state; and a clock module for recording time and providing time labels. By combining the two modalities of text and audio, the invention can recognize voice emotion with improved accuracy.

Description

Technical field

[0001] The invention relates to the field of human-computer emotional interaction, and in particular to a multimodal non-contact emotion analysis and recording system.

Background technique

[0002] Language is the most important tool for communication between people. Human speech carries textual symbolic information and also contains the speaker's emotions. Extracting emotional information features from speech is of great significance in the field of artificial intelligence. Humans communicate through language, and human emotions are expressed through multiple channels and modalities, such as language content, audio, facial expressions, and body movements. Speech emotion recognition identifies the speaker's emotional information from voice signals.

[0003] Current research mainly extracts emotional feature information from speech prosody. The speech emotion recognition system mainly relies on the low-level acoustic features of s...

Claims


Application Information

Patent Type & Authority: Patents (China)
IPC(8): G06F17/27, G10L25/63
Inventors: 孙晓, 孙重远, 高飞, 叶嘉麒, 任福继
Owner 山东心法科技有限公司