Speech emotion recognition: CNN or RNN?

Data Science Asked by Mohamed Amine on May 25, 2021

I want to build a speech emotion classifier, and I have labeled my data with 3 emotions: {negative, neutral, positive}. The speech files I have differ in length, so my audio features (MFCC, ZCR, etc.) vary in length as well. I have read that LSTMs (and RNNs in general) are the kind of model suited to variable-length input, but all the code I have found implements either a CNN or a CNN combined with an LSTM.
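A minimal sketch of the variable-length issue, assuming each file yields a frame-level feature matrix of shape (n_frames, 13). The random arrays, the helper name `pad_sequences`, and the choice of 13 coefficients are illustrative assumptions, not taken from the actual data. It shows one common workaround: zero-padding every sequence to the longest one and keeping a mask of the real frames, so a fixed-shape batch can be fed to a CNN or a masked RNN.

```python
import numpy as np

def pad_sequences(features, pad_value=0.0):
    """Pad a list of (n_frames, n_features) arrays to a common length.

    Returns the padded batch and a boolean mask that is True on real frames.
    """
    max_len = max(f.shape[0] for f in features)
    n_feat = features[0].shape[1]
    batch = np.full((len(features), max_len, n_feat), pad_value, dtype=np.float32)
    mask = np.zeros((len(features), max_len), dtype=bool)
    for i, f in enumerate(features):
        batch[i, : f.shape[0]] = f      # copy the real frames
        mask[i, : f.shape[0]] = True    # mark them as valid
    return batch, mask

# Hypothetical stand-ins for per-file MFCC matrices of different durations.
rng = np.random.default_rng(0)
features = [rng.normal(size=(n, 13)) for n in (120, 87, 203)]

batch, mask = pad_sequences(features)
print(batch.shape)        # (files, longest sequence, coefficients)
print(mask.sum(axis=1))   # number of real frames per file
```

The mask matters because a model that treats the padded zeros as real frames may learn from file duration rather than from the audio content.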

So here are my questions:
Given how I have defined my emotion classes, which audio features should I use?
How is a CNN used for speech emotion classification?
Which kind of model should I use, and why?

audio recognition python rnn sentiment analysis
