Speech emotion recognition method through fusion of feature assessment and multi-layer perceptron

A speech emotion recognition technology based on a multi-layer perceptron, applied in speech recognition, character and pattern recognition, speech analysis, and related fields. It addresses the problems of reduced classification accuracy, a complicated implementation process, and reduced training and recognition efficiency, and achieves improved classification accuracy and a simple implementation process while avoiding an overly long training process.

Active Publication Date: 2017-11-24
HUNAN UNIV

AI Technical Summary

Problems solved by technology

[0003] At present, speech emotion recognition usually relies on a variety of speech features, including zero-crossing rate (ZCR), fundamental frequency (F0), energy, MFCC (Mel-frequency cepstral coefficients), and LFPC, combined with classification models such as HMM (hidden Markov model), GMM (Gaussian mixture model), SVM (support vector machine), and KNN (k-nearest neighbors) to perform emotion classification. However, using these speech features together usually produces a very large number of features. When the feature dimension is too large, it causes the "curse of dimensionality": training takes a very long time, which reduces training and recognition efficiency, and classification accuracy decreases.
[0004] To address the problem of excessively large feature dimensions, methods such as principal component analysis (PCA), linear discriminant analysis (LDA), or kernel principal component analysis (KPCA) are usually used to reduce dimensionality while retaining useful features. However, these dimensionality reduction methods are complicated to implement, and the resulting recognition accuracy is not high. For example, some practitioners have proposed forming 42-dimensional acoustic features from MFCC, energy, and other features, reducing their dimensionality with KPCA, and then classifying with a GMM-SVM classifier. That implementation is complex, the dimensionality reduction still takes a lot of time, and the recognition rate on EMO-DB reaches only 69.9%.
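
As a concrete illustration of two of the simpler features named in [0003], the following is a minimal sketch of per-frame zero-crossing rate and short-time energy. The frame length, hop size, test signal, and all function names are assumptions made for illustration; the source does not specify how these features are computed.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split a sample sequence into overlapping frames (a trailing partial frame is dropped)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return sum(x * x for x in frame) / len(frame) if frame else 0.0


# Assumed test input: a 100 Hz sine sampled at 8 kHz (not real speech).
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(800)]

frames = frame_signal(signal, frame_len=200, hop=100)
features = [(zero_crossing_rate(f), short_time_energy(f)) for f in frames]
```

Each frame contributes one value per feature, so stacking many feature types over many frames is what drives the feature dimension up.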

Examples

Embodiment Construction

[0044] The present invention will be further described below in conjunction with the accompanying drawings and specific preferred embodiments, but the protection scope of the present invention is not limited thereby.

[0045] As shown in Figures 1 and 2, the speech emotion recognition method of this embodiment, which fuses feature evaluation with a multi-layer perceptron, comprises the following steps:

[0046] S1. Feature extraction: extract the multi-dimensional emotional feature parameters of the training speech sets corresponding to the various emotional states to obtain the original feature set;

[0047] S2. Feature evaluation: rate and rank the emotional feature parameters in the original feature set to obtain the sorted feature set;

[0048] S3. Optimal feature set selection: take different numbers of emotional feature parameters from the sorted feature set to form multiple feature subsets, classify each feature subset with the multi-layer perceptron (MLP), and select the optimal ...
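
Steps S2 and S3 can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented procedure: the visible text does not name the rating criterion, so a Fisher score is assumed here, and a nearest-centroid classifier stands in for the MLP so the example stays short. All function names and the toy data are hypothetical.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Assumed rating criterion: between-class scatter over within-class scatter, per feature."""
    scores = []
    for j in range(len(X[0])):
        overall = mean(row[j] for row in X)
        between = within = 0.0
        for c in set(y):
            values = [row[j] for row, label in zip(X, y) if label == c]
            between += len(values) * (mean(values) - overall) ** 2
            within += len(values) * pvariance(values)
        if within > 0:
            scores.append(between / within)
        else:
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def rank_features(X, y):
    """S2: feature indices sorted from highest to lowest score."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def nearest_centroid_accuracy(X, y):
    """Stand-in evaluator (NOT the MLP of the source): training accuracy of nearest centroid."""
    centroids = {}
    for c in set(y):
        rows = [row for row, label in zip(X, y) if label == c]
        centroids[c] = [mean(col) for col in zip(*rows)]

    def predict(row):
        return min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))

    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def select_best_subset(X, y, ranked, sizes, evaluate):
    """S3: evaluate the top-k ranked features for each k and keep the best-scoring subset."""
    best_acc, best_subset = None, None
    for k in sizes:
        subset = ranked[:k]
        X_sub = [[row[j] for j in subset] for row in X]
        acc = evaluate(X_sub, y)
        if best_acc is None or acc > best_acc:  # ties keep the earlier (smaller) subset
            best_acc, best_subset = acc, subset
    return best_subset, best_acc


# Toy data (made up): feature 0 separates the classes, feature 1 is uninformative.
X = [[0.0, 5.0, 1.0], [0.1, 1.0, 1.2], [0.2, 3.0, 0.9],
     [1.0, 4.0, 1.5], [1.1, 2.0, 1.1], [0.9, 3.0, 1.6]]
y = ["calm", "calm", "calm", "angry", "angry", "angry"]

ranked = rank_features(X, y)
subset, accuracy = select_best_subset(X, y, ranked, sizes=[1, 2, 3],
                                      evaluate=nearest_centroid_accuracy)
```

Because `evaluate` is passed in as a parameter, an MLP-based scorer could replace the stand-in without changing the selection loop.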

Abstract

The present invention discloses a speech emotion recognition method that fuses feature assessment with a multi-layer perceptron. The method comprises the following steps: S1, extracting the multi-dimensional emotion feature parameters of the training speech sets corresponding to the various emotion states to obtain an original feature set; S2, rating and ranking the emotion feature parameters in the original feature set to obtain a sorted feature set; S3, taking different numbers of features from the sorted feature set to form a plurality of feature subsets, classifying each feature subset with a multi-layer perceptron, and selecting an optimal feature subset according to the classification results; and S4, training an emotion classification model on the optimal feature subset with the multi-layer perceptron, and recognizing the emotion of the speech to be recognized with the trained classification model. The method is simple to implement, fuses feature assessment with a multi-layer perceptron to realize emotion recognition, and achieves high emotion recognition accuracy and efficiency.
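
To make step S4 more concrete, here is a minimal sketch of training a multi-layer perceptron classifier on an already selected feature subset. Everything specific in it is an assumption: the single tanh hidden layer, softmax output, plain stochastic gradient descent, the hyperparameters, the class name `TinyMLP`, and the toy data. The source does not state the network architecture or training settings.

```python
import math
import random


class TinyMLP:
    """Hypothetical one-hidden-layer perceptron: tanh hidden units, softmax output."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.W1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.W2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        """Return (hidden activations, class probabilities) for one feature vector."""
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.W1, self.b1)]
        z = [sum(w * hi for w, hi in zip(row, h)) + b
             for row, b in zip(self.W2, self.b2)]
        m = max(z)  # subtract the max logit for numerical stability
        e = [math.exp(v - m) for v in z]
        total = sum(e)
        return h, [v / total for v in e]

    def loss(self, X, y):
        """Mean cross-entropy over a labelled set."""
        return -sum(math.log(self.forward(x)[1][t]) for x, t in zip(X, y)) / len(X)

    def train_step(self, x, target, lr):
        h, p = self.forward(x)
        # Gradient of cross-entropy with respect to the logits: p - one_hot(target).
        dz = [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]
        # Backpropagate into the hidden layer before W2 is modified.
        dh = [sum(dz[k] * self.W2[k][j] for k in range(len(dz))) * (1.0 - h[j] ** 2)
              for j in range(len(h))]
        for k, g in enumerate(dz):
            for j in range(len(h)):
                self.W2[k][j] -= lr * g * h[j]
            self.b2[k] -= lr * g
        for j, g in enumerate(dh):
            for i in range(len(x)):
                self.W1[j][i] -= lr * g * x[i]
            self.b1[j] -= lr * g

    def fit(self, X, y, epochs=300, lr=0.2):
        for _ in range(epochs):
            for x, t in zip(X, y):
                self.train_step(x, t, lr)

    def predict(self, x):
        p = self.forward(x)[1]
        return max(range(len(p)), key=p.__getitem__)


# Toy two-feature, two-class data standing in for the optimal feature subset.
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.2], [1.0, 0.9], [0.9, 1.0], [0.8, 0.8]]
y = [0, 0, 0, 1, 1, 1]

model = TinyMLP(n_in=2, n_hidden=4, n_out=2)
loss_before = model.loss(X, y)
model.fit(X, y)
loss_after = model.loss(X, y)
predictions = [model.predict(x) for x in X]
```

In practice a library implementation would normally replace this hand-written network; the sketch only shows the shape of the training and prediction steps.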

Description

Technical field

[0001] The invention relates to the technical field of speech emotion recognition, in particular to a speech emotion recognition method that fuses feature evaluation with a multi-layer perceptron.

Background

[0002] Speech emotion recognition aims to let computers understand human emotions so that they can respond intelligently and in a friendly manner, making human-computer interaction more natural. Compared with traditional human-computer interaction (HCI), speech emotion recognition can provide more natural and friendly interactive applications, for example in call centers, distance learning, and car driving.

[0003] At present, speech emotion recognition usually relies on a variety of speech features, including zero-crossing rate (ZCR), fundamental frequency (F0), energy, MFCC (Mel-frequency cepstral coefficients), and LFPC, combined with classific...

Application Information

IPC(8): G10L15/02, G10L15/08, G10L25/63, G06K9/62
CPC: G10L15/02, G10L15/08, G10L25/63, G06F18/231, G06F18/24, G06F18/214
Inventor: 赵欢, 王松, 陈佐, 谭彪
Owner: HUNAN UNIV