Data Science Asked by Patrick Flynn on April 30, 2021
In BERT, multiple words in a single sentence can be masked at once. Does the model infer all of those words simultaneously, or does it iterate over them, either left-to-right or in some other order?
For example:
The dog walked in the park.
can be masked as
The [MASK] walked in [MASK] park.
In what order (if any) are these tokens predicted?
If you have any further reading on the topic, I’d appreciate it as well.
All of the masked tokens are predicted simultaneously, in a single forward pass, and each prediction is made independently of the others: the distribution over one masked position does not condition on the token predicted at the other. For further reading, see the original BERT paper (Devlin et al., 2018, "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding"), which describes this masked-language-modeling objective.
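You can see this directly in code. Below is a minimal sketch, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (neither is named in the original answer): both masked positions in your example sentence get their logits from the same forward pass.

```python
import torch
from transformers import BertForMaskedLM, BertTokenizer

# Hypothetical setup: any BERT masked-LM checkpoint would demonstrate the same point.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

text = "The [MASK] walked in [MASK] park."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    # One forward pass produces logits for every position at once:
    # shape (batch, seq_len, vocab_size).
    logits = model(**inputs).logits

# Locate both [MASK] positions. Their predictions come from the same
# logits tensor, so neither conditions on the token filled in at the other.
mask_positions = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
for pos in mask_positions:
    predicted_id = logits[0, pos].argmax().item()
    print(pos.item(), tokenizer.convert_ids_to_tokens(predicted_id))
```

If you instead wanted each prediction to condition on the previously filled-in masks, you would have to refill one mask at a time and re-run the model; vanilla BERT does nothing like that during pretraining or inference.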
Correct answer by noe on April 30, 2021