Cross Validated Asked by Joel on February 21, 2021
Context: My goal is to fit a GEV distribution to data $z$, where the location parameter is parametrised as a linear combination of predictor variables, $\mu(\vec{x}) = \mu_0 + \mu_1 x_1 + …$ (like the mean/location parameter in linear regression). However, the number of (potential) predictors $X$ is quite large, so I plan to apply $\ell_1$ (Lasso) regularisation to this parametrisation (cf. an earlier question).
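The setup described in the context can be sketched as follows. This is only a minimal illustration under stated assumptions, not a recommended fitting procedure: the data are simulated, the names `penalised_nll`, `lam` and the starting values are hypothetical, the scale is log-parametrised to keep it positive, and a generic derivative-free optimiser stands in for a proper Lasso solver. Note that `scipy.stats.genextreme` uses a shape parameter `c` with the opposite sign to the usual GEV shape $\xi$.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import genextreme

rng = np.random.default_rng(0)

# Simulated data (hypothetical): only the first of three predictors matters.
n, p = 300, 3
X = rng.normal(size=(n, p))
true_mu = 1.0 + 1.0 * X[:, 0]
# scipy's shape parameter c equals -xi in the usual GEV convention.
z = genextreme.rvs(c=-0.1, loc=true_mu, scale=1.0, size=n, random_state=rng)


def penalised_nll(params, X, z, lam):
    """Negative GEV log-likelihood plus an l1 penalty on the slopes."""
    p = X.shape[1]
    mu0, beta = params[0], params[1:1 + p]
    sigma = np.exp(params[1 + p])  # log-parametrised so that sigma > 0
    xi = params[2 + p]
    mu = mu0 + X @ beta            # location is linear in the predictors
    loglik = genextreme.logpdf(z, c=-xi, loc=mu, scale=sigma)
    # logpdf is -inf outside the support, which makes the loss infinite there.
    return -np.sum(loglik) + lam * np.sum(np.abs(beta))


lam = 5.0  # penalty weight; in practice chosen e.g. by cross-validation
x0 = np.concatenate(([z.mean()], np.zeros(p), [np.log(z.std())], [0.1]))
fit = minimize(
    penalised_nll, x0, args=(X, z, lam), method="Nelder-Mead",
    options={"maxiter": 20000, "maxfev": 20000, "xatol": 1e-6, "fatol": 1e-6},
)
beta_hat = fit.x[1:1 + p]
print("intercept:", fit.x[0], "slopes:", beta_hat)
```

Because the $\ell_1$ term is not differentiable at zero, a generic optimiser like this one will shrink the slopes but generally not set them exactly to zero; a proximal or coordinate-descent method would be needed for exact sparsity.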
Question: Since I (a) know/assume the functional form (a GEV) and (b) try to optimise not just the expectation $E(Z | X=x)$ but the full distribution, I assume it’s fair to regularise the log-likelihood when fitting the distribution (several articles seem to support this approach: 2 3). However, I never came across it in the literature. I assume this is because (a) it requires knowing/assuming a functional form (which one usually doesn’t in statistical learning problems) and (b) it’s more costly to calculate the log-likelihood than, for example, a squared error loss. Is this correct, and/or are there further reasons for not using the likelihood for regularisation?
> However, I never came across it in the literature.
There is a lot of literature on fitting distributions. Think, for instance, of Pearson's method of moments and chi-squared test, both of which are more than a hundred years old.
Fitting by maximising the likelihood (maximum likelihood estimation) is also a method that is (just) more than a hundred years old. In addition, Pearson's chi-squared test is being replaced by the G-test, which is based on the likelihood.
Regularised maximum likelihood methods are not uncommon either. Model selection methods that use criteria like BIC or AIC can be considered regularised likelihood regression (where the penalty is on $\Vert \beta \Vert_0$, the number of non-zero coefficients).
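To illustrate the AIC/BIC point, here is a small sketch that compares the two criteria with an explicit $\ell_0$-penalised negative log-likelihood. The assumptions are mine, not from the question: simulated data, a Gaussian linear model instead of a GEV, exhaustive search over all subsets, and hypothetical helper names (`max_loglik`, `l0_penalised`). Parameters shared by every candidate model (intercept, variance) are left out of the count, which only shifts the criteria by a constant.

```python
from itertools import combinations

import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 4
X = rng.normal(size=(n, p))
y = 0.5 + 2.0 * X[:, 0] + rng.normal(size=n)  # only predictor 0 is active


def max_loglik(cols):
    """Maximised Gaussian log-likelihood of an OLS fit on the given columns."""
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sigma2 = np.sum((y - A @ coef) ** 2) / n  # ML estimate of the variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)


# Every subset of predictors is one candidate model.
subsets = [c for size in range(p + 1) for c in combinations(range(p), size)]
loglik = np.array([max_loglik(c) for c in subsets])
k = np.array([len(c) for c in subsets])  # ||beta||_0 of each candidate


def l0_penalised(lam):
    """Negative log-likelihood plus lam times the number of non-zero slopes."""
    return -loglik + lam * k


# AIC and BIC, up to an additive constant shared by all candidates.
aic = 2 * k - 2 * loglik
bic = np.log(n) * k - 2 * loglik

best_aic = subsets[int(np.argmin(l0_penalised(1.0)))]               # lam = 1
best_bic = subsets[int(np.argmin(l0_penalised(0.5 * np.log(n))))]   # lam = log(n)/2
print("selected by AIC:", best_aic, "selected by BIC:", best_bic)
```

In this sketch AIC is exactly twice the penalised objective with `lam = 1`, and BIC is twice the objective with `lam = log(n) / 2`, so the two criteria differ only in how strongly they penalise the number of non-zero coefficients.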
Thus it might be a matter of terminology that you do not read much about regularised maximum likelihood methods. Another related concept is Bayesian regression: the maximum a posteriori estimate can be considered a regularised maximum likelihood estimate.
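The link between the maximum a posteriori estimate and a penalised likelihood can be written out in a few lines. As an illustration only, assume independent Laplace priors with scale $b$ on the coefficients $\beta_j$ and flat priors on the remaining parameters (the prior choice is my assumption, not something stated in the question):

$$
\hat{\theta}_{\text{MAP}} = \arg\max_{\theta} \left[ \log L(\theta; z) + \log \pi(\theta) \right], \qquad \pi(\beta_j) = \frac{1}{2b} \exp\left( -\frac{|\beta_j|}{b} \right)
$$

$$
\log \pi(\theta) = -\frac{1}{b} \sum_j |\beta_j| + \text{const} \quad \Longrightarrow \quad \hat{\theta}_{\text{MAP}} = \arg\min_{\theta} \left[ -\log L(\theta; z) + \lambda \Vert \beta \Vert_1 \right], \qquad \lambda = \frac{1}{b}
$$

Under these assumptions the MAP estimate coincides with the $\ell_1$-regularised maximum likelihood estimate; a Gaussian prior would give a squared ($\ell_2$) penalty in the same way.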
The two reasons you give, (a) and (b), may explain why regularised likelihood is not used often, but they do not explain why you cannot find anything about it in the literature.
Answered by Sextus Empiricus on February 21, 2021