Gamma-Poisson Distribution

最后发布时间 : 2025-01-14 23:37:42 浏览量 :

https://const-ae.name/post/2021-01-24-gamma-poisson-distribution/gamma-poisson-reference/
The Gamma-Poisson distribution is a statistical distribution for overdispersed count data. It is also known under the name Negative Binomial. Unlike the Negative Binomial which is primarily used for repeated trials and number of success / failures, the Gamma-Poisson is parametrized by the mean μμ\mu and the overdispersion αα\alpha. If α=0α=0\alpha = 0, it reduces to the Poisson distribution.

General Properties

Definition: Y∼GammaPoisson(μ,α)Y∼GammaPoisson(μ,α) Y \sim \text\left(\mu,\alpha\right) where YYY are non-negative integer (i.e., “counts”).

Probability Mass Function

fGP(y|μ,α)=Γ(y+1/α)Γ(1/α)Γ(y+1)(μ2αμ+μ2α)y(11+μα)1/αfGP(y|μ,α)=Γ(y+1/α)Γ(1/α)Γ(y+1)(μ2αμ+μ2α)y(11+μα)1/α f_{\text}(y | \mu, \alpha) = \frac{\Gamma\left(y + 1/\alpha\right)}{\Gamma\left(1/\alpha\right)\Gamma\left(y + 1\right)} \left(\frac{\mu^2 \alpha}{\mu + \mu^2 \alpha}\right)^ \left(\frac{1}{1 + \mu \alpha}\right)^{1/\alpha}

Cumulative Distribution Function

FGP(y|μ,α)=I(11+μα;1/α,1+y)FGP(y|μ,α)=I(11+μα;1/α,1+y) F_{\text}(y | \mu, \alpha) = I\left(\frac{1}{1+\mu\alpha}; 1/\alpha, 1 + y\right) where I(z;a,b)I(z;a,b)I(z; a, b) is the regularized incomplete beta function.

Momements

Mean

E[Y]=μE[Y]=μ \mathbb[Y] = \mu

Variance

Var[Y]=μ+αμ2Var[Y]=μ+αμ2 \mathbb\text[Y] = \mu + \alpha \mu^2

Skewness

Skew[Y]=√μ1+μα(1+2μα)μSkew[Y]=μ1+μα(1+2μα)μ \mathbb\text[Y] = \frac{\sqrt{\frac{\mu}{1 + \mu\alpha}} \left(1 + 2 \mu\alpha \right)}{\mu}

Kurtosis

Kurtosis[Y]=1+6μα+6μ2α2μ+μ2α+3Kurtosis[Y]=1+6μα+6μ2α2μ+μ2α+3 \mathbb\text[Y] = \frac{1 + 6 \mu\alpha + 6 \mu^2\alpha^2}{\mu + \mu^2\alpha} + 3

Moment Generating Function

MX∼GP(t):=E[exp(tX)]=(11−μα(et−1))1/αMX∼GP(t):=E[exp⁡(tX)]=(11−μα(et−1))1/α M_{X\sim\text}(t) := \mathbb[\exp\left(t X\right)] = \left(\frac{1}{1 - \mu\alpha (e^t-1)}\right)^{1/\alpha}

Characteristic Function

MX∼GP(t):=E[exp(tX)]=(11−μα(eit−1))1/αMX∼GP(t):=E[exp⁡(tX)]=(11−μα(eit−1))1/α M_{X\sim\text}(t) := \mathbb[\exp\left(t X\right)] = \left(\frac{1}{1 - \mu\alpha (e^-1)}\right)^{1/\alpha}

Fisher Information

I(μ)=E[(∂∂μlogfGP(y))2∣∣∣μ]=1μ+αμ2I(μ)=E[(∂∂μlog⁡fGP(y))2|μ]=1μ+αμ2 \mathcal(\mu) = \mathbb\left[\left(\frac{\partial}{\partial\mu}\log f_{\text}(y)\right)^2\;\Bigg|\; \mu \right] = \frac{1}{\mu + \alpha \mu^2}

Mathematica fails to calculate I(α)I(α)\mathcal(\alpha).

Useful derivatives

∂∂μlog(fGP(y|μ,α)=y−μμ+αμ2∂∂μlog⁡(fGP(y|μ,α)=y−μμ+αμ2 \frac{\partial}{\partial\mu}\log\left(f_(y|\mu,\alpha\right) = \frac{y - \mu}{\mu+\alpha\mu^2}

∂∂β0log(fGP(y|μ,α)=y−μ1+αμ,∂∂β0log⁡(fGP(y|μ,α)=y−μ1+αμ, \frac{\partial}{\partial\beta_0}\log\left(f_(y|\mu,\alpha\right) = \frac{y - \mu}{1+\alpha\mu}, where μ=exp(β0+offset)μ=exp⁡(β0+offset)\mu = \exp\left(\beta_0+\text\right).

Relation to the Negative Binomial Distribution

The Gamma-Poisson and the Negative Binomial distribution are mathematically identical, they just use different parametrizations. However, this can cause a lot of confusion, especially because the RR\texttt and Wikipedia again use two different parametrizations of the Negative Binomial:

Wikipedia uses rrr for the number of trials and ppp for the number of failures
RR\texttt uses sizesize\text for the number of trials and p′p′p' for the number of successes, thus p′=1−pp′=1−pp' = 1 - p.

To convert from the Negative Binomial to the Gamma-Poisson parametrization μ=pr1−p=(1−p′)rp′α=1/r=1/size.μ=pr1−p=(1−p′)rp′α=1/r=1/size. \begin \mu &= \frac{1 - p} &&= \frac{(1-p') r}{p'} \\ \alpha &= 1/r &&= 1/\text. \end

To convert from the Gamma-Poisson to the Negative Binomial parametrization: p=αμ1+αμp′=11+αμp=αμ1+αμp′=11+αμ \begin p &= \frac{\alpha\mu}{1 + \alpha\mu} \\ p' &= \frac{1}{1+ \alpha\mu} \end and r=size=1/α.r=size=1/α. r = \text = 1/\alpha.

RR\texttt Implementation

Code

dgampoi <- function(x, mean, overdispersion){
gamma(x + 1/overdispersion)/(gamma(1/overdispersion) * gamma(x + 1)) *
((mean^2 * overdispersion) / (mean + mean^2 * overdispersion))^x *
(1/(1 + mean * overdispersion))^(1/overdispersion)
}

# Or based on the existing 'dnbinom'
dgampoi2 <- function(x, mean, overdispersion){
dnbinom(x, mu = mean, size = 1 / overdispersion)
}

Code

library(tidyverse)
cross_df(list(x = seq(0, 200), mu = 100, alpha = c(0, 0.05, 0.2, 2))) %>%
mutate(dens = dgampoi2(x, mean = mu, overdispersion = alpha)) %>%
ggplot(aes(x = x-0.5)) +
geom_step(aes(y = dens, color = as.factor(alpha)), show.legend = FALSE) +
annotate("text", x = 12, y = 0.03, label = expression(alpha==2), size = 7) +
annotate("text", x = 30, y = 0.003, label = expression(alpha==0.2), size = 7, hjust = 1) +
annotate("text", x = 120, y = 0.025, label = expression(alpha==0), size = 7) +
annotate("text", x = 80, y = 0.015, label = expression(alpha==0.05), size = 7, hjust = 1) +
cowplot::theme_cowplot(font_size = 20) +
coord_trans(xlim = c(0, 150), ylim = c(0, 0.04)) +
scale_y_continuous(expand = expansion(0)) +
scale_x_continuous(expand = expansion(0), breaks = seq(0, 140, by = 20)) +
labs(title ="Density of Gamma Poisson with µ=100",
x = "y", y = expression(fGP))

Code

cross_df(list(x = seq(0, 30), mu = 10, alpha = c(0, 0.05, 0.2, 2))) %>%
mutate(dens = dgampoi2(x, mean = mu, overdispersion = alpha)) %>%
ggplot(aes(x = x-0.5)) +
geom_step(aes(y = dens, color = as.factor(alpha)), show.legend = FALSE) +
cowplot::theme_cowplot(font_size = 20) +
coord_trans(xlim = c(0, 30)) +
scale_y_continuous(expand = expansion(0)) +
scale_x_continuous(expand = expansion(0), breaks = seq(0, 30, by = 5)) +
labs(title ="Density of Gamma Poisson with µ = 10",
x = "y", y = expression(fGP))