Multi-modal emotion recognition method based on attention enhancing mechanism

An attention-based emotion recognition technique, applied in the field of affective computing

Active Publication Date: 2021-03-12
HANGZHOU DIANZI UNIV
Cites: 8; Cited by: 18

AI Technical Summary

Problems solved by technology

[0005] To solve the problem of interaction between modalities in multimodal emotion recognition, the present invention proposes a multimodal emotion recognition method based on an enhanced attention mechanism.


Examples

Embodiment Construction

[0082] In order to make the object, technical solution and technical effect of the present invention clearer, the present invention will be further described in detail below in conjunction with the embodiments and the accompanying drawings.

[0083] As shown in Figure 1, the multimodal emotion recognition method based on an enhanced attention mechanism of the present invention comprises the following steps:

[0084] Step 1: Extract the FBank acoustic features of the speech information, then encode them through the multi-head attention mechanism to obtain the encoding matrix of the speech signal; for the text information, use the pre-trained BERT model to convert each character in the text into its corresponding vector representation, thereby obtaining the encoding matrix of the entire text;
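The speech half of Step 1 can be illustrated with a minimal NumPy sketch of multi-head self-attention applied to a matrix of FBank frames. This is an illustration under stated assumptions, not the patent's implementation: the frame count, feature size, number of heads, and the randomly initialised projection matrices (`w_q`, `w_k`, `w_v`, `w_o`) are all hypothetical, and the `fbank` array is random placeholder data rather than real acoustic features. The BERT text encoder is omitted here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(frames, num_heads, rng):
    """Encode a (T, d) matrix of acoustic frames with multi-head self-attention.

    The projection weights are random here (an assumption for illustration);
    in a trained model they would be learned parameters.
    """
    T, d = frames.shape
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    d_head = d // num_heads
    # Hypothetical projections for queries, keys, values, and the output.
    w_q, w_k, w_v, w_o = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))

    def split(x):
        # (T, d) -> (num_heads, T, d_head)
        return x.reshape(T, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(frames @ w_q), split(frames @ w_k), split(frames @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, T, T)
    weights = softmax(scores, axis=-1)                   # each row sums to 1
    heads = weights @ v                                  # (heads, T, d_head)
    merged = heads.transpose(1, 0, 2).reshape(T, d)      # concatenate the heads
    return merged @ w_o, weights

rng = np.random.default_rng(0)
fbank = rng.standard_normal((50, 40))  # placeholder: 50 frames x 40 filter-bank bins
speech_encoding, attn = multi_head_self_attention(fbank, num_heads=4, rng=rng)
```

The output keeps one row per input frame, so the result can serve as the speech encoding matrix that the later steps operate on.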

[0085] Step 2: Take the dot product of the speech and text encoding matrices to obtain the alignment matrices of speech and text, tex...
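The dot-product alignment named in Step 2 might be sketched as follows. The source text is truncated at this point, so several details are assumptions: both encodings are taken to share a feature size `d`, the raw dot-product scores are normalised with a softmax, and the function name `cross_modal_alignment` is hypothetical. The inputs are random placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_alignment(speech, text):
    """Align two encoding matrices through their dot product.

    speech: (T_s, d) speech encoding, text: (T_t, d) text encoding.
    The softmax normalisation is an assumption, not taken from the source.
    """
    scores = speech @ text.T                           # (T_s, T_t) similarities
    # For every speech frame, a weighted summary of the text tokens.
    speech_aligned = softmax(scores, axis=1) @ text    # (T_s, d)
    # For every text token, a weighted summary of the speech frames.
    text_aligned = softmax(scores.T, axis=1) @ speech  # (T_t, d)
    return scores, speech_aligned, text_aligned

rng = np.random.default_rng(0)
speech = rng.standard_normal((50, 16))  # placeholder speech encoding
text = rng.standard_normal((12, 16))    # placeholder text encoding
scores, speech_aligned, text_aligned = cross_modal_alignment(speech, text)
```

Each aligned matrix has the same shape as the encoding it is aligned to, which makes it possible to compare the two row by row afterwards.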

Abstract

The invention belongs to the technical field of affective computing and relates to a multimodal emotion recognition method based on an attention enhancement mechanism. The method comprises: obtaining a speech encoding matrix through a multi-head attention mechanism, and obtaining a text encoding matrix through a pre-trained BERT model; taking the dot product of the speech and text encoding matrices to obtain alignment matrices for speech and text, and calibrating the alignment matrices against the original modal encoding information to obtain more local interaction information; concatenating the encoding information, the semantic alignment matrix, and the interaction information of each modality as features to obtain a feature matrix for each modality; aggregating the speech feature matrix and the text feature matrix using a multi-head attention mechanism; converting the aggregated feature matrices into vector representations through an attention mechanism; and concatenating the vector representations of speech and text and obtaining the final emotion classification result using a fully connected network. The method addresses the problem of multimodal interaction and improves the accuracy of multimodal emotion recognition.
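The later stages described in the abstract (feature concatenation, attention pooling, and the fully connected classifier) can be sketched roughly as below. This is a simplified illustration, not the claimed method: the "calibration" is assumed to be the element-wise difference and product of an encoding and its aligned counterpart, the intermediate multi-head aggregation step is skipped, and all weights, sizes, the class count, and the function names are hypothetical. The inputs are random placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def interaction_features(encoding, aligned):
    # Assumed calibration: difference and element-wise product with the
    # original encoding, concatenated with the encoding and the alignment.
    return np.concatenate(
        [encoding, aligned, encoding - aligned, encoding * aligned], axis=1
    )

def attention_pool(features, query):
    # Collapse a (T, d) matrix into a (d,) vector, one weight per time step.
    weights = softmax(features @ query, axis=0)
    return weights @ features

def classify(speech_vec, text_vec, w, b):
    # Concatenate both modality vectors and apply one fully connected layer.
    logits = np.concatenate([speech_vec, text_vec]) @ w + b
    return softmax(logits)

rng = np.random.default_rng(0)
d, num_classes = 8, 4  # hypothetical sizes
speech, speech_aligned = rng.standard_normal((50, d)), rng.standard_normal((50, d))
text, text_aligned = rng.standard_normal((12, d)), rng.standard_normal((12, d))

speech_feat = interaction_features(speech, speech_aligned)  # (50, 4 * d)
text_feat = interaction_features(text, text_aligned)        # (12, 4 * d)
speech_vec = attention_pool(speech_feat, rng.standard_normal(4 * d))
text_vec = attention_pool(text_feat, rng.standard_normal(4 * d))
probs = classify(
    speech_vec, text_vec,
    rng.standard_normal((8 * d, num_classes)), np.zeros(num_classes),
)
```

Pooling each modality to a fixed-length vector before concatenation lets the classifier accept speech and text sequences of different lengths.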

Description

Technical field

[0001] The invention belongs to the technical field of affective computing, and in particular relates to a multimodal emotion recognition method based on an enhanced attention mechanism.

Background technique

[0002] The concept of affective computing was proposed as early as 1995. Affective computing aims to endow machines with the ability to observe, understand, and express various emotions. Although great progress has been made in recent years in image processing, speech recognition, and natural language understanding, a substantial gap remains before a highly harmonious human-computer interaction environment can be established. Modeling complex human emotional expression is very challenging, but it is also the most fundamental problem in human-computer interaction and urgently needs to be solved.

[0003] With the continuous development of social networks, people express their emotions in more and more diverse forms. The traditional single emotion recogniti...

Application Information

IPC(8): G10L15/06, G10L15/16, G10L25/30, G10L25/45, G10L25/63, G06F40/126
CPC: G10L15/063, G10L15/16, G10L25/63, G10L25/30, G10L25/45, G06F40/126, G10L2015/0631
Inventors: 林菲, 刘盛强
Owner: HANGZHOU DIANZI UNIV