Generalized linear models
Linear models assume that the response variable
- is normally distributed
- has constant variance
- is independent across observations
There are many situations where these assumptions are inappropriate:
- The response is either binary (0/1) or a count
- The response is continuous, but non-normal
Generalized linear models (GLMs): the response distribution is a member of the exponential family (normal, exponential, gamma, binomial, Poisson)
GLMs are simple models that include linear regression (fit by OLS) as a special case
Parameter estimation is by maximum likelihood (this assumes that the response distribution is known)
Inference on parameters is based on large-sample or asymptotic theory
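As a minimal sketch of maximum likelihood estimation and the large-sample standard error, consider the simplest possible case: a Poisson response with no covariates, where log(mu) = beta0. The function name `fit_poisson_intercept` and the data are made up for illustration, and Newton-Raphson is only one of several ways to maximize the likelihood.

```python
import math

def fit_poisson_intercept(y, tol=1e-10, max_iter=50):
    """Newton-Raphson MLE for an intercept-only Poisson model, log(mu) = beta0.

    Assumes the counts are not all zero (otherwise the MLE is at minus infinity).
    """
    n = len(y)
    total = sum(y)
    beta0 = 0.0  # starting value, i.e. mu = 1
    for _ in range(max_iter):
        mu = math.exp(beta0)
        score = total - n * mu   # first derivative of the log-likelihood
        info = n * mu            # Fisher information (minus the second derivative)
        step = score / info
        beta0 += step
        if abs(step) < tol:
            break
    return beta0

counts = [2, 0, 3, 1, 4, 2]  # made-up count data
beta0_hat = fit_poisson_intercept(counts)
mu_hat = math.exp(beta0_hat)

# Asymptotic standard error of beta0_hat: inverse square root of the information.
se_beta0 = 1.0 / math.sqrt(len(counts) * mu_hat)

print("estimated mean:", mu_hat)
print("asymptotic SE of beta0:", se_beta0)
```

In this special case the estimated mean coincides with the sample mean of the counts, which gives a quick check that the iteration has converged.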
Random component: the distribution of y_i is from the exponential family:
f(y_i;\theta_i,\phi)=\exp\left\{\frac{y_i\theta_i-b(\theta_i)}{a(\phi)}+h(y_i,\phi)\right\}
The normal, binomial, multinomial, Poisson, exponential and gamma distributions are all from the exponential family.
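To see how a familiar distribution fits this form, the sketch below checks numerically that the Poisson probability mass function matches the exponential family expression when the canonical parameter is theta = log(mu), b(theta) = exp(theta), the dispersion term in the denominator equals 1, and h = -log(y!). The function names are illustrative, not taken from any library.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability mass function written in the usual way."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

def expfam_density(y, theta, b, dispersion, h):
    """Exponential family form: exp{(y*theta - b(theta)) / dispersion + h(y)}."""
    return math.exp((y * theta - b(theta)) / dispersion + h(y))

mu = 2.5                 # arbitrary mean chosen for the check
theta = math.log(mu)     # canonical parameter for the Poisson

for y in range(6):
    direct = poisson_pmf(y, mu)
    via_family = expfam_density(
        y,
        theta,
        b=math.exp,                         # b(theta) = exp(theta)
        dispersion=1.0,                     # no extra dispersion for the Poisson
        h=lambda k: -math.lgamma(k + 1),    # h = -log(y!)
    )
    assert math.isclose(direct, via_family, rel_tol=1e-9)
    print(y, direct, via_family)
```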
Systematic component: linear predictor
\eta_i = \beta_0 + \beta_1x_{i1}+\dots +\beta_Kx_{iK}
Link function:
g[E(y_i)]=g(\mu_i)=\eta_i
g(\cdot) is monotone and differentiable
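The relation between the linear predictor and the mean can be sketched as follows. The code computes eta for one observation and maps it back to the mean through the inverse of three common links: identity (the usual choice for a normal response), log (Poisson) and logit (binomial). The coefficients and covariate values are hypothetical, as are the names `LINKS` and `linear_predictor`.

```python
import math

# Each entry is (g, g_inverse): g maps the mean to the linear predictor scale.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (math.log, math.exp),
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}

def linear_predictor(beta, x):
    """eta = beta_0 + beta_1 * x_1 + ... + beta_K * x_K."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

beta = [-1.0, 0.5, 2.0]   # hypothetical coefficients (intercept first)
x = [2.0, 0.25]           # hypothetical covariate values for one observation

eta = linear_predictor(beta, x)   # -1.0 + 0.5*2.0 + 2.0*0.25 = 0.5

for name, (g, g_inv) in LINKS.items():
    mu = g_inv(eta)               # mean implied by this link
    # Applying g to the mean recovers eta, since g is invertible.
    assert abs(g(mu) - eta) < 1e-12
    print(name, "mean:", mu)
```

Because each link is monotone, it has an inverse, and that inverse is what turns a fitted linear predictor into a fitted mean on the response scale.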