Data Science Asked on December 25, 2020
When reading about deep learning I often come across the rule of thumb that deep learning is only effective when you have large amounts of data at your disposal. These statements are generally accompanied by a figure such as this:

[Figure: schematic plot of performance vs. amount of data, in which larger neural networks keep improving with more data while traditional learning algorithms plateau.]
The example (taken from https://hackernoon.com/%EF%B8%8F-big-challenge-in-deep-learning-training-data-31a88b97b282 ) is attributed to a ‘famous slide from Andrew Ng’. Does anyone know what this figure is actually based upon? Is there any research that backs up this claim?
The original slide in question, "Scale drives deep learning progress", is possibly what you can currently find at https://cs230.stanford.edu/files/C1M1.pdf (page 13). It may be roughly interpreted as "low-bias learners [in that plot, larger neural networks] tend to benefit from more training examples".
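For intuition, here is a minimal sketch that reproduces the schematic shape of those curves. It assumes, purely for illustration, saturating learning curves whose plateau rises with model capacity; the functional form and all constants are invented, not taken from the slide.

```python
import numpy as np
import matplotlib.pyplot as plt

# Amount of labeled data (arbitrary units, log-spaced).
n = np.logspace(0, 6, 200)

def curve(n, plateau, scale):
    """Hypothetical saturating learning curve: performance approaches a
    capacity-dependent plateau as the amount of data grows."""
    return plateau * (1 - np.exp(-n / scale))

# Constants are invented purely to mimic the schematic shape of the slide.
for plateau, scale, label in [
    (0.70, 1e2, "traditional algorithm"),
    (0.80, 1e3, "small neural net"),
    (0.90, 1e4, "medium neural net"),
    (0.98, 1e5, "large neural net"),
]:
    plt.plot(n, curve(n, plateau, scale), label=label)

plt.xscale("log")
plt.xlabel("amount of data")
plt.ylabel("performance")
plt.title("Scale drives deep learning progress (schematic)")
plt.legend()
plt.show()
```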
Answered by Davide Fiocco on December 25, 2020
The main reason is that deep learning models have a very large number of trainable parameters, and a common rule of thumb says that you need at least $5$ to $10$ training examples per parameter to get a good predictor (a rough calculation is sketched below). The reason is a bit involved to explain, but it is related to PAC learning.
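As a back-of-the-envelope illustration of that rule of thumb (the $5$-$10\times$ factor is the heuristic quoted above, not a theorem), here is a minimal sketch that counts the parameters of a hypothetical fully connected network and multiplies by that factor:

```python
def mlp_param_count(layer_sizes):
    """Number of weights and biases in a fully connected network
    with the given layer widths, e.g. [784, 128, 64, 10]."""
    return sum(
        fan_in * fan_out + fan_out  # weight matrix + bias vector
        for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:])
    )

# Hypothetical architecture, chosen only for illustration.
params = mlp_param_count([784, 128, 64, 10])
print(f"parameters: {params}")  # 109386
print(f"rule-of-thumb data: {5 * params} to {10 * params} examples")
```

Even this small network would, by the heuristic, call for roughly half a million to a million labeled examples.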
If you insist on knowing why: in a generalization bound, the error on test data equals the training error plus an overfitting term that depends on the growth function of the hypothesis class, i.e. on how the number of distinguishable behaviours of the class grows with the sample size $n$. For hypothesis classes whose growth function grows like $O(2^n)$, such as 1NN, it is impossible to drive the generalisation error towards the training error, no matter how much data you add. By contrast, for classes whose growth function is polynomial, $O(n^c)$, the overfitting term shrinks as the size of the training set increases. Consequently, if you increase the size of your data you get a better generalisation error. Deep learning models fall into the second category: the more data you have, the better the generalisation.
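To make this concrete, one common form of the VC generalization bound (as in Learning From Data by Abu-Mostafa et al.; the exact constants vary between textbooks, so treat this as illustrative) says that with probability at least $1-\delta$ over the draw of $n$ training examples,

$$E_{\text{out}}(h) \le E_{\text{in}}(h) + \sqrt{\frac{8}{n}\,\ln\frac{4\,m_{\mathcal{H}}(2n)}{\delta}},$$

where $m_{\mathcal{H}}$ is the growth function of the hypothesis class $\mathcal{H}$. If $m_{\mathcal{H}}(n) = O(n^c)$, the second term behaves like $\sqrt{c \ln n / n}$ and vanishes as $n \to \infty$; if $m_{\mathcal{H}}(n) = 2^n$, then $\ln m_{\mathcal{H}}(2n) = 2n\ln 2$ and the term never vanishes, however large $n$ gets.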
Answered by Media on December 25, 2020