How to find the best fitting parametric distribution for an empirical dataset (stock returns)?

Question

Given some real-valued empirical data (a time series), I could convert it to a histogram to get a non-parametric empirical distribution of the data, but histograms are blocky and jagged.
Instead, I would like to identify the best-fitting parametric distribution from the scipy.stats library of distributions, so that I can generate a continuous parametric distribution that closely fits the empirical distribution of my real data.
If the empirical data are, for example, monthly AAPL stock returns, I know that the parametric Johnson SU distribution resembles, and can mimic, stock return distributions because of its customizable skew. However, the Johnson SU distribution in scipy requires four parameters to be calibrated (two shape parameters plus location and scale). How can I search for the parameter values of this scipy distribution that best fit the empirical distribution of my sample of AAPL returns?

Shahriyar Mammadli · Answer

First of all, if you want to find the distribution that best fits your data, you can iteratively fit your data to a long list of candidate distributions. Scipy supports most of the common ones. After fitting, you can either use the Kolmogorov-Smirnov (KS) test to find which distribution fits best, or compare the fit errors. This solution does what you want; the other solutions in that post are also very good approaches to your problem.
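The loop described above can be sketched roughly as follows. This is only a minimal illustration, not the code from the linked post: the helper name `rank_distributions`, the short candidate list, and the synthetic `sample` (a stand-in for real returns) are all assumptions made for the example. Note also that KS p-values tend to be optimistic when the parameters were estimated from the same data, so the statistic is used here only to rank candidates.

```python
import numpy as np
from scipy import stats


def rank_distributions(data, candidates=("norm", "t", "laplace", "logistic", "johnsonsu")):
    """Fit each candidate by maximum likelihood and rank by KS statistic.

    `candidates` is an illustrative short list, not an exhaustive one.
    """
    results = []
    for name in candidates:
        dist = getattr(stats, name)          # look up the scipy.stats distribution
        params = dist.fit(data)              # shape parameters (if any), then loc, scale
        ks = stats.kstest(data, name, args=params)
        results.append((name, ks.statistic, ks.pvalue, params))
    # A smaller KS statistic means the fitted CDF is closer to the empirical CDF.
    return sorted(results, key=lambda r: r[1])


# Synthetic stand-in for monthly returns (assumption: replace with real data).
rng = np.random.default_rng(0)
sample = stats.t.rvs(df=4, loc=0.01, scale=0.05, size=500, random_state=rng)

ranking = rank_distributions(sample)
for name, ks_stat, p_value, _ in ranking:
    print(f"{name:10s} KS={ks_stat:.4f} p={p_value:.3f}")
```

The first entry of `ranking` is the candidate with the smallest KS statistic; its fitted parameters are the last element of that tuple.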
Also, if you are sure that your data follow a Johnson SU distribution, use Scipy's fit function to find the parameters; it returns the maximum-likelihood estimates of the two shape parameters, the location, and the scale for your data.
import scipy.stats
a, b, loc, scale = scipy.stats.johnsonsu.fit(data)
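Once the four parameters are fitted, they can be used to build a "frozen" distribution that provides a smooth density and can generate artificial returns. The sketch below assumes a placeholder array `returns` drawn from a Johnson SU distribution with made-up parameters, since no real AAPL data is available here; the variable names are illustrative only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Placeholder data (assumption): substitute the real monthly returns here.
returns = stats.johnsonsu.rvs(a=-0.3, b=1.5, loc=0.01, scale=0.08,
                              size=2000, random_state=rng)

# Maximum-likelihood fit: two shape parameters, then location and scale.
a, b, loc, scale = stats.johnsonsu.fit(returns)

# Freeze the fitted distribution so it can be reused without repeating parameters.
fitted = stats.johnsonsu(a, b, loc=loc, scale=scale)

# Smooth density over the observed range (the continuous replacement for a histogram).
grid = np.linspace(returns.min(), returns.max(), 200)
density = fitted.pdf(grid)

# Artificial returns drawn from the fitted distribution.
simulated = fitted.rvs(size=10_000, random_state=rng)

# Quick sanity check: compare empirical and fitted quantiles.
probs = [0.05, 0.25, 0.5, 0.75, 0.95]
empirical_q = np.quantile(returns, probs)
fitted_q = fitted.ppf(probs)
print(np.column_stack([probs, empirical_q, fitted_q]))
```

If the fitted quantiles sit far from the empirical ones, the fit may have stopped at a poor optimum; passing starting guesses for the shape parameters to `fit` (for example `johnsonsu.fit(returns, 0.0, 1.0)`) is one thing to try.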
