Training method for bimodal emotion recognition model and bimodal emotion recognition method

An emotion recognition and speech emotion recognition technology, applied to the training of a dual-modal emotion recognition model and to dual-modal emotion recognition. It addresses the low accuracy of existing emotion recognition and achieves faster training and higher accuracy.

Active Publication Date: 2019-12-10
PEKING UNIV SHENZHEN GRADUATE SCHOOL
13 Cites, 23 Cited by

AI Technical Summary

Problems solved by technology

However, the accuracy of existing emotion recognition methods is relatively low.

Examples

Embodiment 1

[0076] To make this embodiment easier to understand, an electronic device that executes the dual-modal emotion recognition model training method or the dual-modal emotion recognition method disclosed in the embodiments of the present application is first described in detail.

[0077] Figure 1 is a block diagram of the electronic device. The electronic device 100 may include a memory 111, a storage controller 112, a processor 113, a peripheral interface 114, an input and output unit 115, and a display unit 116. Those of ordinary skill in the art will understand that the structure shown in Figure 1 is only illustrative and does not limit the structure of the electronic device 100. For example, the electronic device 100 may include more or fewer components than shown in Figure 1, or have a configuration different from that shown in Figure 1.

[0078] The memory 111, storage controller 112, processor 113, peripheral interface 114...

Embodiment 2

[0086] Figure 2 is a flowchart of the method for training a dual-modal emotion recognition model provided in an embodiment of the present application. The specific process shown in Figure 2 is described in detail below.

[0087] Step 201: input speech training data into the first neural network model for training, so as to obtain a speech emotion recognition model.

[0088] Optionally, step 201 may include: inputting speech training data into the first neural network model and performing supervised training with a joint loss function composed of an affinity loss function (Affinity loss) and a focal loss function (Focal loss), to obtain the speech emotion recognition model.

[0089] To address emotion confusion and class imbalance in emotion data, the speech emotion recognition model draws on the idea of metric learning and uses the combination of Affinity loss and Focal loss as its loss function. Compared with the existing method that only ...
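
The focal loss half of the joint loss named above can be sketched as follows. This is a minimal illustration, not the patent's implementation. It uses the standard focal loss form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t). The function names, the default values of gamma and alpha, and the weight lam that combines the two terms are assumptions, and the affinity term is passed in as a precomputed number because this excerpt does not give its formula.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Plain cross-entropy for one sample, used here only for comparison."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Standard focal loss for one sample.

    gamma and alpha are illustrative defaults, not values from the patent.
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    samples, so hard or minority-class samples dominate the gradient.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def joint_loss(affinity_term, focal_term, lam=1.0):
    """Hypothetical combination rule: focal term plus a weighted affinity term.

    The excerpt only says the two losses are combined; the weight `lam`
    and the simple sum are assumptions made for this sketch.
    """
    return focal_term + lam * affinity_term


if __name__ == "__main__":
    easy = [5.0, 0.0, 0.0]   # confidently correct for class 0
    hard = [0.0, 1.0, 0.0]   # class 0 is not the top score
    for name, logits in (("easy", easy), ("hard", hard)):
        print(name, cross_entropy(logits, 0), focal_loss(logits, 0))
```

With gamma set to 0 and alpha set to 1, the focal term reduces to ordinary cross-entropy, which makes a convenient sanity check when tuning the two parameters.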

Embodiment 3

[0136] Based on the same inventive concept, an embodiment of the present application also provides an emotion recognition model training device corresponding to the emotion recognition model training method. Because the device solves the problem on the same principle as the training method described above, its implementation can refer to the implementation of the method, and the shared description is not repeated here.

[0137] Figure 4 is a schematic diagram of the functional modules of the emotion recognition model training device provided in an embodiment of the present application. Each module of the device in this embodiment executes a corresponding step of the above method embodiment. The emotion recognition model training device includes: a first training module 301, a second training modul...

Abstract

The invention provides a training method for a bimodal emotion recognition model and a bimodal emotion recognition method. The training method comprises the following steps: inputting voice training data into a first neural network model for training to obtain a voice emotion recognition model; inputting image training data into a second neural network model and carrying out first-stage supervised training with a first loss function to obtain an initial image emotion recognition model; inputting the image training data into the initial image emotion recognition model and carrying out second-stage supervised training with a second loss function to obtain a target image emotion recognition model; and carrying out decision fusion on the voice emotion recognition model and the target image emotion recognition model to obtain the bimodal emotion recognition model.
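
The final decision fusion step can be illustrated with a short sketch. The abstract does not say which fusion rule is used, so the weighted average of per-class probabilities below is an assumption chosen for simplicity; the function name fuse_decisions and the parameter speech_weight are hypothetical.

```python
def fuse_decisions(speech_probs, image_probs, speech_weight=0.5):
    """Fuse two per-class probability vectors at the decision level.

    Assumed rule (not taken from the patent): a weighted average of the
    speech model's and the image model's class probabilities, followed by
    an argmax over the fused scores.
    Returns (predicted_class_index, fused_probabilities).
    """
    if len(speech_probs) != len(image_probs):
        raise ValueError("both models must score the same set of classes")
    if not 0.0 <= speech_weight <= 1.0:
        raise ValueError("speech_weight must lie in [0, 1]")

    image_weight = 1.0 - speech_weight
    fused = [
        speech_weight * s + image_weight * i
        for s, i in zip(speech_probs, image_probs)
    ]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return predicted, fused


if __name__ == "__main__":
    # Hypothetical outputs of the two trained models for three emotion classes.
    speech = [0.6, 0.3, 0.1]
    image = [0.2, 0.7, 0.1]
    print(fuse_decisions(speech, image))
    print(fuse_decisions(speech, image, speech_weight=1.0))
```

Fusing at the decision level keeps the two models independent, so each can be trained on its own data, as the steps above describe, and only their outputs need to agree on the set of emotion classes.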

Description

Technical field

[0001] The present application relates to the technical fields of voice processing and image processing, and in particular to a dual-modal emotion recognition model training method and a dual-modal emotion recognition method.

Background

[0002] Dual-modal emotion recognition combines speech signal processing, digital image processing, pattern recognition, psychology and other disciplines. It is an important branch of human-computer interaction and helps provide a better, more human-centered user experience. It enables a robot to perceive and analyze the emotional state of the user and then generate a corresponding response. Emotion recognition, as an important ability of the robot, therefore has broad research and application prospects. However, the accuracy of existing emotion recognition is relatively low.

Summary of the invention

[0003] In view of this, the purpose of the embodiments of the present ...

Application Information

Patent Type & Authority: Applications (China)
IPC(8): G10L25/63, G10L25/30, G06K9/62, G06N3/04
CPC: G10L25/63, G10L25/30, G06N3/045, G06F18/214
Inventors: 邹月娴, 张钰莹, 甘蕾
Owner: PEKING UNIV SHENZHEN GRADUATE SCHOOL