TransWikia.com

Accuracy gain vs amount of data in Neural Networks

Data Science Asked by Jjang on July 23, 2020

There’s a theoretical question I came across in the excellent book Neural Networks and Deep Learning by Michael Nielsen, which I would love to discuss.

The question is:

How do our machine learning algorithms perform in the limit of very
large data sets? For any given algorithm it’s natural to attempt to
define a notion of asymptotic performance in the limit of truly big
data. A quick-and-dirty approach to this problem is to simply try
fitting curves to graphs like those shown above, and then to
extrapolate the fitted curves out to infinity. An objection to this
approach is that different approaches to curve fitting will give
different notions of asymptotic performance. Can you find a principled
justification for fitting to some particular class of curves? If so,
compare the asymptotic performance of several different machine
learning algorithms.

Regarding a justification for fitting a particular class of curves: empirically, from looking at several datasets, I’ve seen that when more data does improve accuracy, the gain is usually roughly linear in the amount of data.
Of course that doesn’t always hold, because it depends on the model: it has to be large and expressive enough to actually benefit from the additional data.
The above is just an assumption, and I would love to hear more solid opinions.
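One way to make the "different curve families give different asymptotes" objection from the quote concrete is to fit two plausible families to the same accuracy-vs-data points and compare their limits. The sketch below uses hypothetical, illustrative measurements (the numbers are made up, not from the book) and fits a saturating power law against a logarithmic curve with `scipy.optimize.curve_fit`:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical accuracy vs. training-set-size measurements (illustrative only)
n = np.array([100, 300, 1000, 3000, 10000, 30000], dtype=float)
acc = np.array([0.72, 0.80, 0.86, 0.90, 0.93, 0.95])

# Candidate 1: saturating power law, acc(n) = a - b * n**(-c)
# -> finite asymptotic accuracy a as n -> infinity
def power_law(n, a, b, c):
    return a - b * n ** (-c)

# Candidate 2: logarithmic growth, acc(n) = a + b * log(n)
# -> no finite asymptote; "accuracy" grows without bound
def log_curve(n, a, b):
    return a + b * np.log(n)

p_pow, _ = curve_fit(power_law, n, acc, p0=[1.0, 1.0, 0.5], maxfev=10000)
p_log, _ = curve_fit(log_curve, n, acc)

# Both curves can fit the observed range well, yet they imply very
# different notions of asymptotic performance:
print("power-law asymptote a  =", p_pow[0])
print("log fit at n = 10**9   =", log_curve(1e9, *p_log))
```

Both families can track the observed points closely, which is exactly the problem: extrapolating them to infinity gives qualitatively different answers, so a principled choice between them needs an argument beyond goodness of fit on the data you have.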

Regarding the second question, comparing the asymptotic performance of several machine learning algorithms: I didn’t fully understand the question, but here’s a comparison of an SVM and a fully connected network from the book:
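A small experiment in the spirit of that comparison (not the book’s actual figure) is to train an SVM and a small fully connected network on growing slices of the same dataset and watch how their test accuracies scale. This sketch uses scikit-learn’s digits dataset purely as a stand-in; the model settings are arbitrary illustrative choices:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Toy dataset standing in for the book's MNIST experiment
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Train both models on increasing amounts of data and compare test accuracy
for size in [100, 300, 1000]:
    svm = SVC().fit(X_train[:size], y_train[:size])
    mlp = MLPClassifier(
        hidden_layer_sizes=(50,), max_iter=1000, random_state=0
    ).fit(X_train[:size], y_train[:size])
    print(size, svm.score(X_test, y_test), mlp.score(X_test, y_test))
```

Which model is ahead at a given training-set size can differ from which one would win "in the limit", which is presumably the point of the exercise: rankings at small data need not predict asymptotic rankings.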
