Data Science — Asked by VM_AI on May 20, 2021
My language model with an attention layer is not learning after 20 epochs. Both the training and validation loss increase together, while accuracy plateaus at around 7%.
The input data is pipelined by applying a sliding window of length 10 to each sentence, so that the model learns to predict every word in the vocabulary rather than masking random words in each sentence. You can find the code here: https://drive.google.com/file/d/1La83LKaZNHsGfCtxKtAWgAqAzLwHo9U5/view?usp=sharing
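For reference, here is a minimal sketch of the sliding-window pipeline as I understand it. The names `make_windows` and `window_size`, and the choice of using the first 9 tokens as context and the 10th as the target, are illustrative assumptions, not taken from the linked code.

```python
# Sketch of a length-10 sliding-window pipeline over integer-encoded
# sentences. Assumption: each window is split into a context (first
# window_size - 1 tokens) and a next-word target (the final token).
import numpy as np

window_size = 10  # window length applied to each sentence

def make_windows(token_ids, window_size=window_size):
    """Slide a fixed-length window over one encoded sentence and
    return (contexts, targets) pairs for next-word prediction."""
    contexts, targets = [], []
    for start in range(len(token_ids) - window_size + 1):
        window = token_ids[start:start + window_size]
        contexts.append(window[:-1])  # first window_size - 1 tokens
        targets.append(window[-1])    # final token is the label
    return np.array(contexts), np.array(targets)

# Example: one sentence encoded as token ids 0..14
sentence = list(range(15))
X, y = make_windows(sentence)
print(X.shape, y.shape)  # (6, 9) (6,) -- six windows for this sentence
```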
[Plots of training curves over 20 epochs: orange = training, blue = validation]
Any suggestions on how to get this working would be helpful. Thanks.