Sound Design Asked by VGF on October 28, 2021
I’d like to know how recordings of many different sounds can be analyzed so that they can be visualized in two dimensions.
My idea would be to find two data features (e.g. using principal component analysis) that make every sound class (dog bark, baby cry, etc.) distinguishable from others.
I’m struggling to understand what parameters to focus on and which method to use.
Thanks for any comments.
Dimensionality reduction needs features to start from. You can, for example, extract MFCCs or other low-level features such as the MPEG-7 audio descriptors, and then visualise them with PCA. For this task, though, you might be better off using t-SNE or UMAP, which project high-dimensional data into two dimensions while preserving local clusters. Lastly, have a look at the YAMNet or VGGish models, which are already suited to sound event detection (SED). You can extract their embeddings and treat them as features for your visualisation.
Answered by jojek on October 28, 2021
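The pipeline described in the answer (one feature vector per clip, then a projection to 2-D) can be sketched roughly as follows. This is only an illustration under several assumptions, not the answerer's code: the clips are synthetic tones standing in for two sound classes, the hypothetical helper `band_log_energies` is a crude stand-in for MFCCs or model embeddings, and PCA is done with a plain NumPy SVD.

```python
import numpy as np

def band_log_energies(signal, n_bands=20, frame_len=512, hop=256):
    """Hypothetical stand-in for MFCCs: mean log energy per linear frequency band."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2       # (n_frames, n_bins)
    bands = np.array_split(power, n_bands, axis=1)          # group bins into bands
    energies = np.stack([b.sum(axis=1) for b in bands], axis=1)
    return np.log(energies + 1e-10).mean(axis=0)            # one vector per clip

def pca_2d(features):
    """Project the rows of `features` onto their first two principal components."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

def make_clip(freq, rng, sr=16000, seconds=0.5):
    """Synthetic clip (assumption): a noisy sine tone instead of a real recording."""
    t = np.arange(int(sr * seconds)) / sr
    return np.sin(2 * np.pi * freq * t) + 0.05 * rng.standard_normal(t.size)

rng = np.random.default_rng(0)
# Two made-up "sound classes": low tones (label 0) and high tones (label 1).
clips = [make_clip(300 + rng.uniform(-20, 20), rng) for _ in range(10)]
clips += [make_clip(3000 + rng.uniform(-100, 100), rng) for _ in range(10)]
labels = np.array([0] * 10 + [1] * 10)

features = np.stack([band_log_energies(c) for c in clips])  # (n_clips, n_bands)
points = pca_2d(features)                                   # (n_clips, 2)

for cls in (0, 1):
    print("class", cls, "mean 2-D position:", points[labels == cls].mean(axis=0))
```

With real recordings you would replace `make_clip` with loaded audio and `band_log_energies` with MFCCs or YAMNet/VGGish embeddings, then scatter-plot `points` coloured by `labels`. Swapping `pca_2d` for t-SNE or UMAP (for example via scikit-learn or umap-learn) leaves the rest of the pipeline unchanged.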