Data Science Asked by Johannes Ebster on August 24, 2021
I want to try different methods of conditioning the decoding process of the Google Magenta MusicVAE models for my own research project. As far as I can tell, the authors have already conditioned MusicVAE on chords (e.g., for the ‘hier-multiperf_vel_1bar_med_chords’ model). I also want to try other conditioning signals, such as style tags or diatonicity.
However, I am having a hard time figuring out where the respective conditioning tensors (one-hot-encoded chords, I think) are used during training in the hierarchical decoder. Are the same conditioning tensors concatenated to every decoding step? Only to the first? Or something else entirely? Since this is difficult to work out from the code alone, and the paper (https://arxiv.org/pdf/1803.05428.pdf) does not address this architectural detail, I thought someone involved could clear it up for me. Here is a picture of the MusicVAE architecture as depicted in the paper mentioned above.
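To make the question concrete, here is a minimal sketch of what I understand "concatenating the same conditioning tensor to every decoding step" to mean. This is my own illustration, not the Magenta implementation; all names (cond_vec, dec_cell, z) and dimensions are placeholders:

```python
# Minimal sketch of per-step conditioning in an RNN decoder.
# NOT the Magenta code; names and dimensions are placeholders.
import tensorflow as tf

batch, steps, z_dim, cond_dim, hid = 4, 16, 32, 12, 64

z = tf.random.normal([batch, z_dim])            # latent code from the encoder
cond_vec = tf.one_hot([0, 3, 7, 10], cond_dim)  # e.g. one-hot chord label per sequence

dec_cell = tf.keras.layers.LSTMCell(hid)
state = [tf.zeros([batch, hid]), tf.zeros([batch, hid])]  # initial (h, c)

prev_out = tf.zeros([batch, hid])               # previous-step output (teacher forcing omitted)
outputs = []
for t in range(steps):
    # The same conditioning vector is appended to the input at every step,
    # alongside the latent code z.
    step_in = tf.concat([prev_out, z, cond_vec], axis=-1)
    prev_out, state = dec_cell(step_in, state)
    outputs.append(prev_out)

decoded = tf.stack(outputs, axis=1)             # shape [batch, steps, hid]
```

My question is essentially whether the hierarchical decoder does something like this per-step concatenation with the chord tensors, or whether they only condition the first step or the conductor's initial state.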
P.S.: I also posted this very same question on Stack Overflow (https://stackoverflow.com/questions/63029684/magenta-musicvae-groovae-conditioning); I can delete it if that is against the rules.