TransWikia.com

Poisson model with overdispersion

Data Science Asked by ignatius on March 14, 2021

I’m working with a dataset $X$ (of length $N$) of count data, which looks like:

[Figure: plot of the count data $X$; image not available]

I developed a statistical model that could be improved, so I’m asking for suggestions: for instance, different likelihoods or priors, a different approach, anything…

My model

I’m trying to estimate the parameters of the likelihood of the data, so that I can obtain a posterior predictive density, credible intervals and so on. In other words, I want to model the generative process of the data given some parameters, $f(X|\theta)$.

This data shows a large overdispersion ($\bar X \ll \mathrm{var}(X)$), thus a Poisson likelihood, $f(X|\lambda) \sim \mathrm{Poisson}(\lambda)$, is not a good choice.
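One quick way to check the overdispersion claim is to compare the sample variance with the sample mean. The sketch below uses made-up counts (not the data from the question) and a hypothetical helper, `dispersion_index`; a ratio far above 1 suggests that a plain Poisson would be too narrow.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean (close to 1 for Poisson data)."""
    return variance(counts) / mean(counts)

# Hypothetical counts, invented for illustration only.
counts = [0, 1, 1, 2, 3, 5, 8, 13, 40, 2]
print(dispersion_index(counts))  # well above 1 for these made-up numbers
```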

After reading the literature on count data with overdispersion, I decided to model the data with a Negative Binomial distribution, thus $f(X|r, p) \sim \mathrm{NB}(r, p)$.

Parameter estimation

To avoid ending up with a very complex set-up, I’ve performed Bayesian estimation of the parameter $p$ only, letting $r$ be computed from the data: in a Negative Binomial distribution, $r$ is related to the first and second moments of the distribution as follows:

$$
r = \frac{\mu^2}{\sigma^2 - \mu}, \text{ then}
$$

$$
\hat r = \frac{\bar X^2}{\mathrm{var}(X) - \bar X}
$$
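The moment estimate above can be sketched in a few lines. The function name `moment_estimate_r` and the counts are hypothetical; the guard reflects the fact that the formula only gives a positive $\hat r$ when the sample variance exceeds the sample mean.

```python
from statistics import mean, variance

def moment_estimate_r(counts):
    """Method-of-moments estimate r = mean^2 / (var - mean)."""
    m = mean(counts)
    v = variance(counts)
    if v <= m:
        # Without overdispersion the denominator is zero or negative.
        raise ValueError("sample variance must exceed the sample mean")
    return m * m / (v - m)

# Hypothetical counts, invented for illustration only.
counts = [0, 1, 1, 2, 3, 5, 8, 13, 40, 2]
r_hat = moment_estimate_r(counts)
print(r_hat)
```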

The whole set-up is:

  • Likelihood: $f(X|p) = \mathrm{NB}(\hat r, p)$
  • Prior: $f(p) = \mathrm{Beta}(0, 0)$ (non-informative, improper prior)
  • Posterior: $f(p|X) = \mathrm{Beta}(0 + \hat r N, 0 + \sum X)$
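A minimal sketch of this conjugate update, assuming $p$ is the success probability and $X$ counts failures (the parametrization under which the posterior above holds). The helper names, the value of `r_hat` and the counts are all hypothetical, and the credible interval is approximated by Monte Carlo draws rather than an exact Beta quantile.

```python
import random

def beta_posterior(counts, r, a0=0.0, b0=0.0):
    """Beta(a0, b0) prior on p, NB(r, p) likelihood with r held fixed."""
    return a0 + r * len(counts), b0 + sum(counts)

def credible_interval(alpha, beta, level=0.95, draws=20000, seed=0):
    """Equal-tailed interval for p, approximated from sorted Beta draws."""
    rng = random.Random(seed)
    samples = sorted(rng.betavariate(alpha, beta) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lo = samples[int(tail * draws)]
    hi = samples[int((1.0 - tail) * draws) - 1]
    return lo, hi

# Hypothetical inputs, invented for illustration only.
counts = [0, 1, 1, 2, 3, 5, 8, 13, 40, 2]
r_hat = 0.4

alpha, beta = beta_posterior(counts, r_hat)   # Beta(r_hat * N, sum X)
post_mean = alpha / (alpha + beta)
lo, hi = credible_interval(alpha, beta)
print(post_mean, (lo, hi))
```

The improper $\mathrm{Beta}(0, 0)$ prior still yields a proper posterior here, as long as $\hat r N > 0$ and $\sum X > 0$.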

which returned the following posterior predictive distribution:

[Figure: posterior predictive distribution; image not available]

The first and second moments of the posterior predictive distribution are very close to those of the data (I’ve let the data have a huge impact on the posterior, since I’ve chosen a non-informative prior). Also, the point-estimate posterior predictive (using $\mu_p$) does not differ from the posterior predictive distribution averaged over all possible values of $p$.
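The similarity between the plug-in and the averaged predictive can be quantified. The sketch below assumes $\mu_p$ denotes the posterior mean of $p$, and uses the fact that averaging $\mathrm{NB}(r, p)$ over a $\mathrm{Beta}(\alpha, \beta)$ posterior gives a beta negative binomial distribution. All numeric values and function names are hypothetical.

```python
from math import exp, lgamma, log

def log_beta_fn(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def nb_pmf(x, r, p):
    """NB pmf: x failures before r successes, success probability p."""
    log_coef = lgamma(r + x) - lgamma(x + 1) - lgamma(r)
    return exp(log_coef + r * log(p) + x * log(1.0 - p))

def averaged_pmf(x, r, alpha, beta):
    """NB(r, p) averaged over p ~ Beta(alpha, beta) (beta negative binomial)."""
    log_coef = lgamma(r + x) - lgamma(x + 1) - lgamma(r)
    return exp(log_coef + log_beta_fn(alpha + r, beta + x) - log_beta_fn(alpha, beta))

def total_variation(r, alpha, beta, x_max=5000):
    """Total variation distance between plug-in and averaged predictive,
    truncated at x_max (the remaining tail mass is assumed negligible)."""
    p_mean = alpha / (alpha + beta)
    return 0.5 * sum(
        abs(nb_pmf(x, r, p_mean) - averaged_pmf(x, r, alpha, beta))
        for x in range(x_max + 1)
    )

# Hypothetical posteriors: a small sample and one 100 times larger.
tv_small_n = total_variation(r=0.4, alpha=4.0, beta=75.0)
tv_large_n = total_variation(r=0.4, alpha=400.0, beta=7500.0)
print(tv_small_n, tv_large_n)
```

The two predictives get closer as the posterior on $p$ concentrates, so a near-identical plot is what one would expect for a large $N$. This comparison does not account for the uncertainty in $\hat r$, which is held fixed throughout.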

Once again, any suggestions for improvement?

EDIT

What about a zero-truncated negative binomial distribution?
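For reference, a zero-truncated negative binomial simply renormalizes the NB pmf over $x \geq 1$ by dividing by $1 - P(X = 0) = 1 - p^r$. A minimal sketch, with hypothetical function names and parameter values:

```python
from math import exp, lgamma, log

def nb_pmf(x, r, p):
    """NB pmf: x failures before r successes, success probability p."""
    log_coef = lgamma(r + x) - lgamma(x + 1) - lgamma(r)
    return exp(log_coef + r * log(p) + x * log(1.0 - p))

def zero_truncated_nb_pmf(x, r, p):
    """NB pmf conditioned on x >= 1; P(X = 0) = p**r is removed."""
    if x < 1:
        return 0.0
    return nb_pmf(x, r, p) / (1.0 - p ** r)

# Hypothetical parameter values, invented for illustration only.
r, p = 0.4, 0.05
total = sum(zero_truncated_nb_pmf(x, r, p) for x in range(1, 5001))
print(total)  # should be very close to 1
```

Two caveats if this route is taken: the truncated likelihood carries an extra factor $(1 - p^r)^{-N}$, so the Beta prior on $p$ is no longer conjugate, and the moment formula for $\hat r$ no longer applies directly because truncation changes the mean and variance.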
