Gamma-Poisson Distribution
- https://const-ae.name/post/2021-01-24-gamma-poisson-distribution/gamma-poisson-reference/
The Gamma-Poisson distribution is a statistical distribution for overdispersed count data. It is also known under the name Negative Binomial. Unlike the Negative Binomial which is primarily used for repeated trials and number of success / failures, the Gamma-Poisson is parametrized by the mean μμ\mu and the overdispersion αα\alpha. If α=0α=0\alpha = 0, it reduces to the Poisson distribution.
General Properties
Definition: Y∼GammaPoisson(μ,α)Y∼GammaPoisson(μ,α) Y \sim \text\left(\mu,\alpha\right) where YYY are non-negative integer (i.e., “counts”).
Probability Mass Function
fGP(y|μ,α)=Γ(y+1/α)Γ(1/α)Γ(y+1)(μ2αμ+μ2α)y(11+μα)1/αfGP(y|μ,α)=Γ(y+1/α)Γ(1/α)Γ(y+1)(μ2αμ+μ2α)y(11+μα)1/α f_{\text}(y | \mu, \alpha) = \frac{\Gamma\left(y + 1/\alpha\right)}{\Gamma\left(1/\alpha\right)\Gamma\left(y + 1\right)} \left(\frac{\mu^2 \alpha}{\mu + \mu^2 \alpha}\right)^ \left(\frac{1}{1 + \mu \alpha}\right)^{1/\alpha}
Cumulative Distribution Function
FGP(y|μ,α)=I(11+μα;1/α,1+y)FGP(y|μ,α)=I(11+μα;1/α,1+y) F_{\text}(y | \mu, \alpha) = I\left(\frac{1}{1+\mu\alpha}; 1/\alpha, 1 + y\right) where I(z;a,b)I(z;a,b)I(z; a, b) is the regularized incomplete beta function.
Momements
Mean
E[Y]=μE[Y]=μ \mathbb[Y] = \mu
Variance
Var[Y]=μ+αμ2Var[Y]=μ+αμ2 \mathbb\text[Y] = \mu + \alpha \mu^2
Skewness
Skew[Y]=√μ1+μα(1+2μα)μSkew[Y]=μ1+μα(1+2μα)μ \mathbb\text[Y] = \frac{\sqrt{\frac{\mu}{1 + \mu\alpha}} \left(1 + 2 \mu\alpha \right)}{\mu}
Kurtosis
Kurtosis[Y]=1+6μα+6μ2α2μ+μ2α+3Kurtosis[Y]=1+6μα+6μ2α2μ+μ2α+3 \mathbb\text[Y] = \frac{1 + 6 \mu\alpha + 6 \mu^2\alpha^2}{\mu + \mu^2\alpha} + 3
Moment Generating Function
MX∼GP(t):=E[exp(tX)]=(11−μα(et−1))1/αMX∼GP(t):=E[exp(tX)]=(11−μα(et−1))1/α M_{X\sim\text}(t) := \mathbb[\exp\left(t X\right)] = \left(\frac{1}{1 - \mu\alpha (e^t-1)}\right)^{1/\alpha}
Characteristic Function
MX∼GP(t):=E[exp(tX)]=(11−μα(eit−1))1/αMX∼GP(t):=E[exp(tX)]=(11−μα(eit−1))1/α M_{X\sim\text}(t) := \mathbb[\exp\left(t X\right)] = \left(\frac{1}{1 - \mu\alpha (e^-1)}\right)^{1/\alpha}
Fisher Information
I(μ)=E[(∂∂μlogfGP(y))2∣∣∣μ]=1μ+αμ2I(μ)=E[(∂∂μlogfGP(y))2|μ]=1μ+αμ2 \mathcal(\mu) = \mathbb\left[\left(\frac{\partial}{\partial\mu}\log f_{\text}(y)\right)^2\;\Bigg|\; \mu \right] = \frac{1}{\mu + \alpha \mu^2}
Mathematica fails to calculate I(α)I(α)\mathcal(\alpha).
Useful derivatives
∂∂μlog(fGP(y|μ,α)=y−μμ+αμ2∂∂μlog(fGP(y|μ,α)=y−μμ+αμ2 \frac{\partial}{\partial\mu}\log\left(f_(y|\mu,\alpha\right) = \frac{y - \mu}{\mu+\alpha\mu^2}
∂∂β0log(fGP(y|μ,α)=y−μ1+αμ,∂∂β0log(fGP(y|μ,α)=y−μ1+αμ, \frac{\partial}{\partial\beta_0}\log\left(f_(y|\mu,\alpha\right) = \frac{y - \mu}{1+\alpha\mu}, where μ=exp(β0+offset)μ=exp(β0+offset)\mu = \exp\left(\beta_0+\text\right).
Relation to the Negative Binomial Distribution
The Gamma-Poisson and the Negative Binomial distribution are mathematically identical, they just use different parametrizations. However, this can cause a lot of confusion, especially because the RR\texttt and Wikipedia again use two different parametrizations of the Negative Binomial:
- Wikipedia uses rrr for the number of trials and ppp for the number of failures
- RR\texttt uses sizesize\text for the number of trials and p′p′p' for the number of successes, thus p′=1−pp′=1−pp' = 1 - p.
To convert from the Negative Binomial to the Gamma-Poisson parametrization μ=pr1−p=(1−p′)rp′α=1/r=1/size.μ=pr1−p=(1−p′)rp′α=1/r=1/size. \begin \mu &= \frac{1 - p} &&= \frac{(1-p') r}{p'} \\ \alpha &= 1/r &&= 1/\text. \end
To convert from the Gamma-Poisson to the Negative Binomial parametrization: p=αμ1+αμp′=11+αμp=αμ1+αμp′=11+αμ \begin p &= \frac{\alpha\mu}{1 + \alpha\mu} \\ p' &= \frac{1}{1+ \alpha\mu} \end and r=size=1/α.r=size=1/α. r = \text = 1/\alpha.
RR\texttt Implementation
Code
dgampoi <- function(x, mean, overdispersion){
gamma(x + 1/overdispersion)/(gamma(1/overdispersion) * gamma(x + 1)) *
((mean^2 * overdispersion) / (mean + mean^2 * overdispersion))^x *
(1/(1 + mean * overdispersion))^(1/overdispersion)
}
# Or based on the existing 'dnbinom'
dgampoi2 <- function(x, mean, overdispersion){
dnbinom(x, mu = mean, size = 1 / overdispersion)
}
Code
library(tidyverse)
cross_df(list(x = seq(0, 200), mu = 100, alpha = c(0, 0.05, 0.2, 2))) %>%
mutate(dens = dgampoi2(x, mean = mu, overdispersion = alpha)) %>%
ggplot(aes(x = x-0.5)) +
geom_step(aes(y = dens, color = as.factor(alpha)), show.legend = FALSE) +
annotate("text", x = 12, y = 0.03, label = expression(alpha==2), size = 7) +
annotate("text", x = 30, y = 0.003, label = expression(alpha==0.2), size = 7, hjust = 1) +
annotate("text", x = 120, y = 0.025, label = expression(alpha==0), size = 7) +
annotate("text", x = 80, y = 0.015, label = expression(alpha==0.05), size = 7, hjust = 1) +
cowplot::theme_cowplot(font_size = 20) +
coord_trans(xlim = c(0, 150), ylim = c(0, 0.04)) +
scale_y_continuous(expand = expansion(0)) +
scale_x_continuous(expand = expansion(0), breaks = seq(0, 140, by = 20)) +
labs(title ="Density of Gamma Poisson with µ=100",
x = "y", y = expression(fGP))
Code
cross_df(list(x = seq(0, 30), mu = 10, alpha = c(0, 0.05, 0.2, 2))) %>%
mutate(dens = dgampoi2(x, mean = mu, overdispersion = alpha)) %>%
ggplot(aes(x = x-0.5)) +
geom_step(aes(y = dens, color = as.factor(alpha)), show.legend = FALSE) +
cowplot::theme_cowplot(font_size = 20) +
coord_trans(xlim = c(0, 30)) +
scale_y_continuous(expand = expansion(0)) +
scale_x_continuous(expand = expansion(0), breaks = seq(0, 30, by = 5)) +
labs(title ="Density of Gamma Poisson with µ = 10",
x = "y", y = expression(fGP))