Asked by JackEarl on December 20, 2020
I am interested in parametric and non-parametric machine learning algorithms, their advantages and disadvantages, and their main differences in computational complexity. In particular I am interested in the parametric Gaussian Mixture Model (GMM) and non-parametric kernel density estimation (KDE). I found out that with a "small" number of data points, parametric methods (like a GMM fitted with EM) are the better choice, but as the number of data points grows much larger, non-parametric algorithms become better. Could someone please explain both in a bit more detail and compare them?
Non-parametric machine learning algorithms do not make strong assumptions about the form of the underlying mapping function; they infer structure from the patterns observed in the training instances themselves. By not making such assumptions, they are free to learn any functional form from the training data and are hence flexible.
Unlike parametric approaches, where the number of parameters is fixed, in non-parametric approaches the number of effective parameters grows with the training data.
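To make that concrete, here is a minimal sketch (my own illustration, not from any library) counting what each model has to store. It assumes a GMM with K components and full covariance matrices over d-dimensional data, and a Gaussian KDE with a single shared bandwidth:

def gmm_param_count(K: int, d: int) -> int:
    # K means (d values each) + K full covariances (d*(d+1)/2 each)
    # + K-1 free mixing weights: fixed once K and d are chosen
    return K * d + K * d * (d + 1) // 2 + (K - 1)

def kde_stored_values(n: int, d: int) -> int:
    # KDE keeps every one of the n training points plus one bandwidth
    return n * d + 1

print(gmm_param_count(K=3, d=2))          # 17, independent of sample size
print(kde_stored_values(n=10_000, d=2))   # 20001, grows linearly with n

The GMM's storage is fixed by K and d, while the KDE's grows with every training point, which is exactly the parametric/non-parametric distinction above.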
If your data set is too small, or otherwise not representative of the underlying population, the result can be biased in more ways than is possible with parametric methods. Non-parametric algorithms therefore need a large amount of data to give good results, and they are most useful when the relationship between the features is not known in advance.
A Gaussian mixture model (GMM) is essentially k-means clustering with probabilistic (soft) cluster assignments, so its main free parameter is the number of components/clusters. In KDE, by contrast, the free parameters are the kernel, which specifies the shape of the distribution placed at each data point, and the kernel bandwidth, which controls the size of that kernel at each point.
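As a concrete sketch (assuming scikit-learn's GaussianMixture and KernelDensity; the toy data, number of components, and bandwidth here are arbitrary illustrative choices, not recommendations), both models can be fit to the same sample:

import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
# Toy 1-D data drawn from two Gaussians
X = np.concatenate([rng.normal(-2, 0.5, 200),
                    rng.normal(3, 1.0, 200)]).reshape(-1, 1)

# Parametric: the GMM is fully described by its means, covariances,
# and mixing weights once the number of components is fixed.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

# Non-parametric: the KDE keeps every training point; its free
# parameters are the kernel shape and the bandwidth.
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

grid = np.linspace(-5, 7, 200).reshape(-1, 1)
gmm_density = np.exp(gmm.score_samples(grid))  # score_samples returns log density
kde_density = np.exp(kde.score_samples(grid))

Note the asymmetry: once fitted, the GMM can discard its training data, while the KDE must retain all of it to evaluate the density, which is why KDE's cost and memory grow with the sample size.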
Refer to this blog post to visualize KDE better.
Answered by prashant0598 on December 20, 2020