病例对照研究队列研究
点击下载
处理数据
y <- ifelse(expr$YTHDC1>median(expr$YTHDC1),1,0) df <- data.frame(y=y,x=expr$stage,YTHDC1 = expr$YTHDC1)
建立模型
logistic <- glm(y~x,family = binomial(link="logit"),data = df) summ <- summary(logistic)
Call: glm(formula = y ~ x, family = binomial(link = "logit"), data = df) Deviance Residuals: Min 1Q Median 3Q Max -1.23169 -1.16396 -0.00977 1.19092 1.21159 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) 0.1268 0.2255 0.562 0.574 xStage II -0.2068 0.2715 -0.762 0.446 xStage III -0.0795 0.2870 -0.277 0.782 xStage IV -0.1585 0.3382 -0.469 0.639 (Dispersion parameter for binomial family taken to be 1) Null deviance: 615.51 on 443 degrees of freedom Residual deviance: 614.83 on 440 degrees of freedom AIC: 622.83 Number of Fisher Scoring iterations: 3
查看系数
coefficients(logistic)
使用模型预测
predict_=predict.glm(logistic,type="response",newdata=df) table(predict_) my_predict <- function(x2=0,x3=0,x4=0){ a <- 0.12675171+ (x2* -0.20679441)+ (x3* -0.07949882) +(x4* -0.15850040) res <- exp(a)/(exp(a)+1) return(res) } my_predict() my_predict(x2=1) my_predict(x3=1) my_predict(x4=1)
图片alt
如果观测到某个分类变量,该分类变量的取值为1,其余虚拟变量的取值为0
计算OR值
conf <- confint(logistic,level = 0.95) cbind(OR=exp(summ$coefficients[,1]), OR.95L=exp(conf[,1]), OR.95H=exp(conf[,2]), p=summ$coefficients[,4])