Single-gene differential expression analysis

Click to download the data

                             gene group
TCGA-M9-A5M8-01A-11R-A28J-31 2375 Tumor
TCGA-R6-A6DQ-01B-11R-A31P-31 4132 Tumor
TCGA-LN-A9FP-01A-31R-A38D-31 1447 Tumor
TCGA-LN-A4MQ-01A-11R-A28J-31 3247 Tumor
TCGA-JY-A93D-01A-11R-A38D-31 1806 Tumor
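# Read the expression table; the first column (sample ID) becomes the row names
df <- read.csv("expr_mRNA_group.csv", row.names = 1)
# Boxplot of log2 expression by group (Normal vs. Tumor)
boxplot(log2(gene) ~ group, data = df, col = c("green", "red"))
## Compute the p-value with a Wilcoxon rank-sum test
wilcoxTest <- wilcox.test(gene ~ group, data = df)
pValue <- wilcoxTest$p.value
conGeneMeans <- mean(subset(df, group == "Normal")$gene)
treatGeneMeans <- mean(subset(df, group == "Tumor")$gene)
## Compute logFC (log2 fold change, Tumor relative to Normal)
logFC <- log2(treatGeneMeans / conGeneMeans)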

When running wilcox.test, which expression measure should be used: FPKM or raw counts?
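
One consideration that bears on this question: wilcox.test ranks the values across samples, and the logFC above compares group means, so the expression measure has to be comparable between samples. Raw counts scale with sequencing depth, whereas depth-normalized values (FPKM, TPM, CPM) do not. A minimal sketch on made-up data, where the matrix `counts`, the depth factors and the gene names are all assumptions for illustration:

```r
# Hypothetical count matrix: 3 genes x 6 samples.
# The relative composition is identical in every sample; only depth differs.
group <- factor(rep(c("Normal", "Tumor"), each = 3))
depth <- c(1, 1, 1, 5, 5, 5)  # assumption: tumor samples sequenced 5x deeper
counts <- rbind(
  geneA = 100 * depth,
  geneB = 200 * depth,
  geneC = 700 * depth
)
colnames(counts) <- paste0("S", 1:6)

# Counts per million: divide each sample (column) by its library size
cpm <- t(t(counts) / colSums(counts)) * 1e6

# Fold change of geneA, Tumor relative to Normal, on each scale
raw_fc <- mean(counts["geneA", group == "Tumor"]) /
  mean(counts["geneA", group == "Normal"])
cpm_fc <- mean(cpm["geneA", group == "Tumor"]) /
  mean(cpm["geneA", group == "Normal"])
```

On raw counts geneA appears to change between groups purely because of depth; after depth normalization the apparent change disappears.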

Single-gene paired differential expression analysis
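
For paired designs (for example, tumor and adjacent normal tissue from the same patient), a paired Wilcoxon signed-rank test is one option. A minimal sketch, assuming a hypothetical data frame `paired_df` with one row per patient and columns `normal` and `tumor` holding the gene's expression; the values are simulated, not taken from the data above:

```r
set.seed(42)
n <- 20
# Simulated paired data: each patient contributes one normal and one tumor value
paired_df <- data.frame(
  patient = paste0("P", seq_len(n)),
  normal  = rlnorm(n, meanlog = 7, sdlog = 0.3)
)
paired_df$tumor <- paired_df$normal * rlnorm(n, meanlog = 0.5, sdlog = 0.2)

# Paired Wilcoxon signed-rank test on the within-patient differences
pairedTest <- wilcox.test(paired_df$tumor, paired_df$normal, paired = TRUE)
pValue <- pairedTest$p.value

# Per-patient log2 fold change (tumor vs. normal), averaged over patients
logFC <- mean(log2(paired_df$tumor / paired_df$normal))
```

The two vectors must be in the same patient order, since the test works on the element-wise differences.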

Single-gene clinical correlation analysis

Kruskal-Wallis test

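# Kruskal-Wallis test of YTHDC1 expression across stages (the two calls are equivalent)
kruskal.test(expr$YTHDC1 ~ expr$stage)
kruskal.test(YTHDC1 ~ stage, data = expr)
# boxplot(YTHDC1 ~ stage, data = expr)
library(ggplot2)
library(ggpubr)
compare_means(YTHDC1 ~ stage, data = expr, method = "kruskal.test")
# Pairwise comparisons to annotate, each against Stage I
my_comparisons <- list(c("Stage I", "Stage II"), c("Stage I", "Stage III"), c("Stage I", "Stage IV"))
ggplot(expr, aes(x = stage, y = YTHDC1)) +
  stat_boxplot(geom = "errorbar", width = 0.15, aes(color = stage)) +
  geom_boxplot(aes(fill = stage), outlier.colour = NA) +
  # Note: ylim() drops values outside 5-8 before the statistics are computed
  ylim(5, 8) +
  # Pairwise p-values for the comparisons listed above
  stat_compare_means(comparisons = my_comparisons) +
  # Global Kruskal-Wallis p-value
  stat_compare_means(method = "kruskal.test", label = "p.format")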
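
The brackets drawn by stat_compare_means(comparisons = ...) come from two-group tests (Wilcoxon by default in ggpubr). A base-R sketch of the same idea with multiple-testing correction added, run on simulated data; the data frame `sim` and its values are assumptions for illustration only:

```r
set.seed(123)
# Simulated expression for four stages, 25 patients each
sim <- data.frame(
  stage  = factor(rep(c("Stage I", "Stage II", "Stage III", "Stage IV"), each = 25)),
  YTHDC1 = c(rnorm(25, 6.5, 0.4), rnorm(25, 6.5, 0.4),
             rnorm(25, 6.4, 0.4), rnorm(25, 6.0, 0.4))
)

# Global test across all stages
kw <- kruskal.test(YTHDC1 ~ stage, data = sim)

# All pairwise Wilcoxon rank-sum tests, Benjamini-Hochberg adjusted
pw <- pairwise.wilcox.test(sim$YTHDC1, sim$stage, p.adjust.method = "BH")
pw$p.value  # lower-triangular matrix of adjusted p-values
```

A usual reading order is to look at the global test first and then at the adjusted pairwise p-values.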

Logistic regression

A detailed explanation of how logistic regression works

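# Dichotomize expression at the median: 1 = high, 0 = low
y <- ifelse(expr$YTHDC1 > median(expr$YTHDC1), 1, 0)
# Logistic regression of high/low expression on stage
logistic <- glm(y ~ expr$stage, family = binomial(link = "logit"))
conf <- confint(logistic, level = 0.95)
summ <- summary(logistic)
# Odds ratios, 95% confidence limits and p-values for each coefficient
cbind(OR = exp(summ$coefficients[, 1]),
      OR.95L = exp(conf[, 1]),
      OR.95H = exp(conf[, 2]),
      p = summ$coefficients[, 4])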
         OR      OR.95L      OR.95H           p
1.135135135 0.729863084 1.772778681 0.574001544
0.813186813 0.476284767 1.383923034 0.446310221
0.923579109 0.525114777 1.620832838 0.781756194
0.853422619 0.438620318 1.656265771 0.639262696
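
To make the OR column concrete: for a categorical predictor, the exponentiated coefficient of a level equals the odds of high expression in that level divided by the odds in the reference level, while the exponentiated intercept is the baseline odds in the reference level rather than an odds ratio. A small sketch with made-up counts (the data frame `toy` is hypothetical) checks this against a hand-computed 2x2 odds ratio:

```r
# Hypothetical counts: 40 Stage I patients (16 high), 30 Stage II patients (18 high)
toy <- data.frame(
  stage = factor(rep(c("Stage I", "Stage II"), times = c(40, 30))),
  y     = c(rep(c(1, 0), times = c(16, 24)),
            rep(c(1, 0), times = c(18, 12)))
)

fit <- glm(y ~ stage, data = toy, family = binomial(link = "logit"))

# Odds ratio of Stage II vs. Stage I (the reference level) from the model
or_glm <- unname(exp(coef(fit)["stageStage II"]))
# Baseline odds of high expression in Stage I from the intercept
odds_ref <- unname(exp(coef(fit)["(Intercept)"]))

# The same quantities computed by hand from the counts
or_hand   <- (18 / 12) / (16 / 24)
odds_hand <- 16 / 24
```

Both routes give the same numbers, which is why the first row of a table like the one above should not be read as an odds ratio.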

Single-gene survival analysis
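
One common way to approach this is to split patients at the median expression and compare the two groups with Kaplan-Meier curves, a log-rank test and a Cox model. A minimal sketch using the survival package on simulated data; the data frame `surv_df` and the column names `time` and `status` are assumptions, not the data used above:

```r
library(survival)

set.seed(2024)
n <- 100
# Simulated cohort: expression, follow-up time and event indicator
surv_df <- data.frame(
  YTHDC1 = rnorm(n, mean = 6.5, sd = 0.5),
  time   = rexp(n, rate = 0.1),             # follow-up time, arbitrary units
  status = rbinom(n, size = 1, prob = 0.7)  # 1 = event, 0 = censored
)
# High/low expression groups split at the median
surv_df$exprGroup <- factor(
  ifelse(surv_df$YTHDC1 > median(surv_df$YTHDC1), "High", "Low"),
  levels = c("Low", "High")
)

# Kaplan-Meier curves per group
kmFit <- survfit(Surv(time, status) ~ exprGroup, data = surv_df)
plot(kmFit, col = c("blue", "red"), xlab = "Time", ylab = "Survival probability")
legend("topright", legend = levels(surv_df$exprGroup),
       col = c("blue", "red"), lty = 1)

# Log-rank test and its p-value
logRank <- survdiff(Surv(time, status) ~ exprGroup, data = surv_df)
pLogRank <- 1 - pchisq(logRank$chisq, df = length(logRank$n) - 1)

# Cox proportional hazards model: hazard ratio of High vs. Low with 95% CI
cox <- coxph(Surv(time, status) ~ exprGroup, data = surv_df)
hr <- summary(cox)$conf.int
```

Because the simulated survival times are independent of expression, no real difference between the groups is expected here; the sketch only shows the mechanics.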

References