Why does joint embedding of words and images work?

Data Science Asked by Tianyi Ni on April 13, 2021

I often see papers where the authors do point-wise multiplication of word and image embeddings (e.g. the figure below).

Why does this implementation work? I do not understand it.

[Figure: model architecture showing point-wise multiplication of word and image embeddings]

One Answer

The model is more than just a point-wise multiplication of word and image embeddings. It is a single neural network, so backpropagation can update the weights throughout the entire model. The training signal adjusts all layers to do better at the task, which in this case appears to be visual question answering.

The figure is only showing the forward pass.
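To make the idea concrete, here is a minimal sketch, not the exact architecture in the figure: it assumes a PyTorch-style setup where pre-extracted CNN image features and an LSTM encoding of the question are projected into a shared space, fused by element-wise multiplication, and passed to an answer classifier. All dimensions and layer choices are illustrative.

```python
import torch
import torch.nn as nn

class JointEmbeddingVQA(nn.Module):
    """Sketch of point-wise (element-wise) fusion of image and question features.
    Components and sizes are illustrative assumptions, not taken from the paper."""

    def __init__(self, img_feat_dim=2048, vocab_size=10000,
                 word_dim=300, hidden_dim=1024, num_answers=1000):
        super().__init__()
        # Encode the question: word embeddings -> LSTM -> final hidden state.
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        self.lstm = nn.LSTM(word_dim, hidden_dim, batch_first=True)
        # Project image features into the same hidden_dim space as the question.
        self.img_proj = nn.Linear(img_feat_dim, hidden_dim)
        # Answer classifier over the fused representation.
        self.classifier = nn.Linear(hidden_dim, num_answers)

    def forward(self, img_feats, question_tokens):
        # img_feats: (batch, img_feat_dim), e.g. pooled CNN features.
        # question_tokens: (batch, seq_len) integer word ids.
        _, (h_n, _) = self.lstm(self.word_emb(question_tokens))
        q = torch.tanh(h_n[-1])                   # (batch, hidden_dim)
        v = torch.tanh(self.img_proj(img_feats))  # (batch, hidden_dim)
        fused = q * v                             # point-wise multiplication
        return self.classifier(fused)             # answer logits

# Toy forward pass; during training, a cross-entropy loss on these logits
# backpropagates through the classifier, the LSTM, the word embeddings,
# and the image projection alike.
model = JointEmbeddingVQA()
logits = model(torch.randn(4, 2048), torch.randint(0, 10000, (4, 12)))
print(logits.shape)  # torch.Size([4, 1000])
```

The multiplication itself is just one fusion choice; what makes it work is that every layer on both sides of it is trained end-to-end to produce features whose product is useful for predicting the answer.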

Answered by Brian Spiering on April 13, 2021
