Data Science Asked by Mohamed Amine on May 25, 2021
I want to build a speech emotion classifier. I have labeled my data with three emotions: {negative, neutral, positive}. My speech files differ in length, so my audio features (MFCC, ZCR, etc.) vary in length too. I have read that LSTMs (and RNNs in general) are the kind of model that handles this kind of problem, but all the code I have found implements either a CNN or a CNN combined with an LSTM.
So here are my questions:
Given how my emotions are defined, which audio features should I use?
How do we use a CNN for a speech emotion classifier?
What kind of model should I use, and why?
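To make the variable-length issue concrete, here is a minimal sketch of two common workarounds: padding every feature sequence to a common length with a mask (the usual input form for an LSTM or CNN), and averaging over real frames to get one fixed-size vector per file. Everything in it is an assumption for illustration: the function names `pad_sequences` and `masked_mean` are hypothetical, the data is random, and 13 features per frame only stands in for MFCC-like input.

```python
import numpy as np


def pad_sequences(features, pad_value=0.0):
    """Pad a list of (frames, n_features) arrays to a common length.

    Returns the padded batch and a boolean mask marking real frames.
    """
    max_len = max(f.shape[0] for f in features)
    n_features = features[0].shape[1]
    batch = np.full((len(features), max_len, n_features), pad_value, dtype=np.float32)
    mask = np.zeros((len(features), max_len), dtype=bool)
    for i, f in enumerate(features):
        batch[i, : f.shape[0]] = f      # copy the real frames
        mask[i, : f.shape[0]] = True    # remember which frames are real
    return batch, mask


def masked_mean(batch, mask):
    """Average over real frames only, giving one fixed-size vector per file."""
    weights = mask[:, :, None].astype(batch.dtype)
    return (batch * weights).sum(axis=1) / weights.sum(axis=1)


# Hypothetical data: three utterances of different length, 13 features per frame.
rng = np.random.default_rng(0)
utterances = [rng.normal(size=(n, 13)).astype(np.float32) for n in (50, 80, 120)]

batch, mask = pad_sequences(utterances)
pooled = masked_mean(batch, mask)
print(batch.shape, mask.sum(axis=1), pooled.shape)
```

The padded `batch` plus `mask` is what a sequence model would consume, while `pooled` could feed a plain classifier; which of the two suits this dataset is exactly what the questions above are asking.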