利用CellChat进行细胞间通信的推断与分析
CellChat(github)一种能够从单细胞RNA测序(scRNA-seq)数据定量推断和分析细胞间通信网络的工具。它使用网络分析和模式识别方法预测细胞的主要信号输入和输出,以及这些细胞和信号如何协调功能。
- 安装
devtools::install_github("sqjin/CellChat")
从github下载,使用本的安装的方式:
remotes::install_local("./Downloads/CellChat-1.5.0.tar.gz")
参考github上的教程CellChat-vignette本文给出使用CellChat
对细胞通讯网络的推断和可视化。
CellChat需要细胞的基因表达数据作为输入,通过整合基因表达与
信号配体(signaling ligands)、受体(receptors) 及其 辅因子(cofactors) 之间相互作用的先验知识,对细胞间通信的概率进行建模。
加载CellChat包
library(CellChat)
初始化CellChat对象
CellChat需要两个输入文件,一个是细胞的基因表达数据,另一个是注释的细胞类型(label-based mode)或者单细胞数据的低纬表示(label-free mode)。对于后者label-free mode
CellChat将自动的进行分组(by building a shared neighbor graph based on the cell-cell distance in the low-dimensional space or the pseudotemporal trajectory space)。
加载测试数据
输入的基因表达矩阵应该被标准化并且进行log转换(library-size normalization and then log-transformed with a pseudocount of 1)。
如果输入的是count数据也可以使用CellChat包中提供的normalizeData
函数进行标准化和log转换。
#load(url("https://ndownloader.figshare.com/files/25950872")) # This is a combined data from two biological conditions: normal and diseases
download.file("http://oss.bioinfo.online/cms/image/data_humanSkin_CellChat_1683686262168.rda","data_humanSkin_CellChat.rda")
load("data_humanSkin_CellChat.rda")
data.input = data_humanSkin$data # normalized data matrix
meta = data_humanSkin$meta # a dataframe with rownames containing cell mata data
cell.use = rownames(meta)[meta$condition == "LS"] # extract the cell names from disease data
# Prepare input data for CelChat analysis
data.input = data.input[, cell.use]
meta = meta[cell.use, ]
创建CellChat对象
cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels")
cellchat <- addMeta(cellchat, meta = meta)
cellchat <- setIdent(cellchat, ident.use = "labels") # set "labels" as default cell identity
levels(cellchat@idents) # show factor levels of the cell labels
groupSize <- as.numeric(table(cellchat@idents)) # number of cells in each cell group
> str(cellchat) Formal class 'CellChat' [package "CellChat"] with 15 slots ..@ data.raw : num[0 , 0 ] ..@ data :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots .. .. ..@ i : int [1:4119559] 0 1 6 13 15 16 17 22 24 30 ... .. .. ..@ p : int [1:5012] 0 4064 7504 10940 13718 16363 19325 22489 25141 27734 ... .. .. ..@ Dim : int [1:2] 17328 5011 .. .. ..@ Dimnames:List of 2 .. .. .. ..$ : chr [1:17328] "A1BG" "A1BG-AS1" "A2M" "A2M-AS1" ... .. .. .. ..$ : chr [1:5011] "S1_AACTCCCAGAGCTGCA" "S1_CAACCAATCCTCATTA" "S1_CGCTATCTCCTAGTGA" "S1_ATTTCTGCAGGACGTA" ... .. .. ..@ x : num [1:4119559] 0.577 0.577 0.941 1.207 0.941 ... .. .. ..@ factors : list() ..@ data.signaling: num[0 , 0 ] ..@ data.scale : num[0 , 0 ] ..@ data.project : num[0 , 0 ] ..@ images : list() ..@ net : list() ..@ netP : list() ..@ meta :'data.frame': 5011 obs. of 3 variables: .. ..$ patient.id: chr [1:5011] "Patient1" "Patient1" "Patient1" "Patient1" ... .. ..$ condition : chr [1:5011] "LS" "LS" "LS" "LS" ... .. ..$ labels : Factor w/ 12 levels "APOE+ FIB","FBN1+ FIB",..: 4 2 4 4 4 4 2 4 2 2 ... ..@ idents : Factor w/ 12 levels "APOE+ FIB","FBN1+ FIB",..: 4 2 4 4 4 4 2 4 2 2 ... ..@ DB : list() ..@ LR : list() ..@ var.features : list() ..@ dr : list() ..@ options :List of 2 .. ..$ mode : chr "single" .. ..$ datatype: chr "RNA"
设置受体-配体(ligand-receptor)交互数据库CellChatDB
当前是在human和mouse手动创建的文献中支持的受体-配体交互数据库
- 小鼠中的CellChatDB包含2,021个经过验证的分子相互作用,其中包括60%的分泌自分泌/旁分泌信号相互作用,21%的细胞外基质(ECM)-receptor相互作用和19%的细胞细胞接触相互作用。
- 人类中的CellChatDB包含1,939个经过验证的分子相互作用,其中包括61.8%的旁分泌/自分泌信号相互作用,21.7%的细胞外基质(ECM)-recepto相互作用和16.5%的细胞细胞接触相互作用。
此外CellChatDB也可以手动添加自己的受体-配体(ligand-receptor)
CellChatDB <- CellChatDB.human # use CellChatDB.mouse if running on mouse data
# showDatabaseCategory(CellChatDB)
# dplyr::glimpse(CellChatDB$interaction)
# use a subset of CellChatDB for cell-cell communication analysis
# use all CellChatDB for cell-cell communication analysis
# CellChatDB.use <- CellChatDB # simply use the default CellChatDB
# set the used database in the object
CellChatDB.use <- subsetDB(CellChatDB, search = "Secreted Signaling") # use Secreted Signaling
cellchat@DB <- CellChatDB.use
> str(CellChatDB) List of 4 $ interaction:'data.frame': 1939 obs. of 11 variables: ..$ interaction_name : chr [1:1939] "TGFB1_TGFBR1_TGFBR2" "TGFB2_TGFBR1_TGFBR2" "TGFB3_TGFBR1_TGFBR2" "TGFB1_ACVR1B_TGFBR2" ... ..$ pathway_name : chr [1:1939] "TGFb" "TGFb" "TGFb" "TGFb" ... ..$ ligand : chr [1:1939] "TGFB1" "TGFB2" "TGFB3" "TGFB1" ... ..$ receptor : chr [1:1939] "TGFbR1_R2" "TGFbR1_R2" "TGFbR1_R2" "ACVR1B_TGFbR2" ... ..$ agonist : chr [1:1939] "TGFb agonist" "TGFb agonist" "TGFb agonist" "TGFb agonist" ... ..$ antagonist : chr [1:1939] "TGFb antagonist" "TGFb antagonist" "TGFb antagonist" "TGFb antagonist" ... ..$ co_A_receptor : chr [1:1939] "" "" "" "" ... ..$ co_I_receptor : chr [1:1939] "TGFb inhibition receptor" "TGFb inhibition receptor" "TGFb inhibition receptor" "TGFb inhibition receptor" ... ..$ evidence : chr [1:1939] "KEGG: hsa04350" "KEGG: hsa04350" "KEGG: hsa04350" "PMID: 27449815" ... ..$ annotation : chr [1:1939] "Secreted Signaling" "Secreted Signaling" "Secreted Signaling" "Secreted Signaling" ... ..$ interaction_name_2: chr [1:1939] "TGFB1 - (TGFBR1+TGFBR2)" "TGFB2 - (TGFBR1+TGFBR2)" "TGFB3 - (TGFBR1+TGFBR2)" "TGFB1 - (ACVR1B+TGFBR2)" ... $ complex :'data.frame': 157 obs. of 4 variables: ..$ subunit_1: chr [1:157] "INHBA" "INHA" "INHA" "IL12A" ... ..$ subunit_2: chr [1:157] "INHBB" "INHBA" "INHBB" "IL12B" ... ..$ subunit_3: chr [1:157] "" "" "" "" ... ..$ subunit_4: chr [1:157] "" "" "" "" ... $ cofactor :'data.frame': 31 obs. of 16 variables: ..$ cofactor1 : chr [1:31] "FST" "BAMBI" "TIE1" "PTPRB" ... ..$ cofactor2 : chr [1:31] "" "" "" "" ... ..$ cofactor3 : chr [1:31] "" "" "" "" ... ..$ cofactor4 : chr [1:31] "" "" "" "" ... ..$ cofactor5 : chr [1:31] "" "" "" "" ... ..$ cofactor6 : chr [1:31] "" "" "" "" ... ..$ cofactor7 : chr [1:31] "" "" "" "" ... ..$ cofactor8 : chr [1:31] "" "" "" "" ... ..$ cofactor9 : chr [1:31] "" "" "" "" ... ..$ cofactor10: chr [1:31] "" "" "" "" ... ..$ cofactor11: chr [1:31] "" "" "" "" ... ..$ cofactor12: chr [1:31] "" "" "" "" ... ..$ cofactor13: chr [1:31] "" "" "" "" ... ..$ cofactor14: chr [1:31] "" "" "" "" ... ..$ cofactor15: chr [1:31] "" "" "" "" ... ..$ cofactor16: chr [1:31] "" "" "" "" ... $ geneInfo :'data.frame': 41787 obs. of 6 variables: ..$ Symbol : chr [1:41787] "A1BG" "A1BG-AS1" "A1CF" "A2M" ... ..$ Name : chr [1:41787] "alpha-1-B glycoprotein" "A1BG antisense RNA 1" "APOBEC1 complementation factor" "alpha-2-macroglobulin" ... ..$ EntrezGene.ID : int [1:41787] 1 503538 29974 2 144571 144568 NA NA 3 127550 ... ..$ Ensembl.Gene.ID: chr [1:41787] "ENSG00000121410" "ENSG00000268895" "ENSG00000148584" "ENSG00000175899" ... ..$ MGI.ID : chr [1:41787] "MGI:2152878" "" "MGI:1917115" "MGI:2449119" ... ..$ Gene.group.name: chr [1:41787] "Immunoglobulin like domain containing" "Antisense RNAs" "RNA binding motif containing" "C3 and PZP like, alpha-2-macroglobulin domain containing" ...
基因表达矩阵预处理
为了推断细胞状态特异性的通信,我们在一个细胞组中识别过表达(over-expressed)的配体或受体,然后确定如果配体或受体过表达,则识别过表达的配体 - 受体相互作用。
这里可以通过函数
projectData
将基因表达数据映射到protein-protein interaction(PPI)
网络(Specifically, a diffusion process is used to smooth genes’ expression values based on their neighbors’ defined in a high-confidence experimentally validated protein-protein network)。当分析具有浅测序深度的单细胞数据时,此功能很有用,因为投影降低了信号基因的dropout effects
,特别是对于可能零表达配体/受体的亚基(subunits)。
cellchat <- subsetData(cellchat) # This step is necessary even if using the whole database
future::plan("multiprocess", workers = 4) # do parallel
cellchat <- identifyOverExpressedGenes(cellchat)
cellchat <- identifyOverExpressedInteractions(cellchat)
# project gene expression data onto PPI (Optional: when running it, USER should set `raw.use = FALSE` in the function `computeCommunProb()` in order to use the projected data)
# cellchat <- projectData(cellchat, PPI.human)
推断细胞通讯网络
Cellchat通过( assigning each interaction with a probability value and peforming a permutation test)推断生物学上细胞间通讯的显著性。Cellchat通过使用law of mass action
,将基因表达与信号配体、受体及其辅因子之间相互作用的先验知识相结合,对细胞间通信的概率进行建模。
推断的配体-受体对的数量显然取决于计算每个细胞组平均基因表达的方法。默认情况下,CellChat使用一种称为“trimean”的统计稳健平均方法,与其他方法相比,这种方法产生的交互更少。然而,我们发现CellChat在预测更强的交互方面表现良好,这对缩小交互范围以进行进一步的实验验证非常有帮助。
在computeCommunProbe中,我们提供了使用其他方法的选项,例如5%和10%的截断平均值,来计算平均基因表达。值得注意的是,“trimean”近似于25%的截断平均值,这意味着如果一组中表达细胞的百分比低于25%,则平均基因表达为零。要使用10%的截断平均值,USER可以设置type=“truncatedMean”
和trim=0.1
。
computeAveExpr
函数可以帮助检查感兴趣的信号传导基因的平均表达。
computeAveExpr(cellchat,features=c(“CXCL12”,“CXCR4”, type=“truncatedMean”,trim=0.1)
APOE+ FIB FBN1+ FIB COL11A1+ FIB Inflam. FIB cDC1 cDC2 LC
CXCL12 2.455861 1.693046 0 0.7161257 0 0.0000000 0.000000
CXCR4 0.000000 0.000000 0 0.0000000 0 0.2798084 1.787762
Inflam. DC TC Inflam. TC CD40LG+ TC NKT
CXCL12 0.000000 0 0 0.0000000 0
CXCR4 1.325267 0 0 0.6740233 0
在分析未排序的单细胞转录组时,假设丰富的细胞群体往往比罕见的细胞群体集体发送更强的信号,CellChat也可以在概率计算中考虑每个细胞群体中细胞比例的影响。用户可以设置population.size=TRUE。
计算通讯概率并推断细胞通讯网络
如果所研究的生物过程中众所周知的信号传导途径没有被预测到,USER 可以尝试truncatedMean
来改变计算每个细胞组平均基因表达的方法。
cellchat <- computeCommunProb(cellchat)
# Filter out the cell-cell communication if there are only few number of cells in certain cell groups
cellchat <- filterCommunication(cellchat, min.cells = 10)
> str(cellchat) Formal class 'CellChat' [package "CellChat"] with 15 slots ..@ data.raw : num[0 , 0 ] ..@ data :Formal class 'dgCMatrix' [package "Matrix"] with 6 slots .. .. ..@ i : int [1:4119559] 0 1 6 13 15 16 17 22 24 30 ... .. .. ..@ p : int [1:5012] 0 4064 7504 10940 13718 16363 19325 22489 25141 27734 ... .. .. ..@ Dim : int [1:2] 17328 5011 .. .. ..@ Dimnames:List of 2 .. .. .. ..$ : chr [1:17328] "A1BG" "A1BG-AS1" "A2M" "A2M-AS1" ... .. .. .. ..$ : chr [1:5011] "S1_AACTCCCAGAGCTGCA" "S1_CAACCAATCCTCATTA" "S1_CGCTATCTCCTAGTGA" "S1_ATTTCTGCAGGACGTA" ... .. .. ..@ x : num [1:4119559] 0.577 0.577 0.941 1.207 0.941 ... .. .. ..@ factors : list() ..@ data.signaling:Formal class 'dgCMatrix' [package "Matrix"] with 6 slots .. .. ..@ i : int [1:110348] 5 9 10 11 22 23 31 45 48 53 ... .. .. ..@ p : int [1:5012] 0 107 192 295 366 443 525 613 686 752 ... .. .. ..@ Dim : int [1:2] 555 5011 .. .. ..@ Dimnames:List of 2 .. .. .. ..$ : chr [1:555] "ACKR1" "ACKR2" "ACKR3" "ACKR4" ... .. .. .. ..$ : chr [1:5011] "S1_AACTCCCAGAGCTGCA" "S1_CAACCAATCCTCATTA" "S1_CGCTATCTCCTAGTGA" "S1_ATTTCTGCAGGACGTA" ... .. .. ..@ x : num [1:110348] 0.577 0.577 0.577 0.577 1.207 ... .. .. ..@ factors : list() ..@ data.scale : num[0 , 0 ] ..@ data.project : num[0 , 0 ] ..@ images : list() ..@ net :List of 2 .. ..$ prob: num [1:12, 1:12, 1:656] 0 0 0 0 0 0 0 0 0 0 ... .. .. ..- attr(*, "dimnames")=List of 3 .. .. .. ..$ : chr [1:12] "APOE+ FIB" "FBN1+ FIB" "COL11A1+ FIB" "Inflam. FIB" ... .. .. .. ..$ : chr [1:12] "APOE+ FIB" "FBN1+ FIB" "COL11A1+ FIB" "Inflam. FIB" ... .. .. .. ..$ : chr [1:656] "TGFB1_TGFBR1_TGFBR2" "TGFB2_TGFBR1_TGFBR2" "TGFB3_TGFBR1_TGFBR2" "TGFB1_ACVR1B_TGFBR2" ... .. ..$ pval: num [1:12, 1:12, 1:656] 1 1 1 1 1 1 1 1 1 1 ... .. .. ..- attr(*, "dimnames")=List of 3 .. .. .. ..$ : chr [1:12] "APOE+ FIB" "FBN1+ FIB" "COL11A1+ FIB" "Inflam. FIB" ... .. .. .. ..$ : chr [1:12] "APOE+ FIB" "FBN1+ FIB" "COL11A1+ FIB" "Inflam. FIB" ... .. .. .. ..$ : chr [1:656] "TGFB1_TGFBR1_TGFBR2" "TGFB2_TGFBR1_TGFBR2" "TGFB3_TGFBR1_TGFBR2" "TGFB1_ACVR1B_TGFBR2" ... ..@ netP : list() ..@ meta :'data.frame': 5011 obs. of 3 variables: .. ..$ patient.id: chr [1:5011] "Patient1" "Patient1" "Patient1" "Patient1" ... .. ..$ condition : chr [1:5011] "LS" "LS" "LS" "LS" ... .. ..$ labels : Factor w/ 12 levels "APOE+ FIB","FBN1+ FIB",..: 4 2 4 4 4 4 2 4 2 2 ... ..@ idents : Factor w/ 12 levels "APOE+ FIB","FBN1+ FIB",..: 4 2 4 4 4 4 2 4 2 2 ... ..@ DB :List of 4 .. ..$ interaction:'data.frame': 1199 obs. of 11 variables: .. .. ..$ interaction_name : chr [1:1199] "TGFB1_TGFBR1_TGFBR2" "TGFB2_TGFBR1_TGFBR2" "TGFB3_TGFBR1_TGFBR2" "TGFB1_ACVR1B_TGFBR2" ... .. .. ..$ pathway_name : chr [1:1199] "TGFb" "TGFb" "TGFb" "TGFb" ... .. .. ..$ ligand : chr [1:1199] "TGFB1" "TGFB2" "TGFB3" "TGFB1" ... .. .. ..$ receptor : chr [1:1199] "TGFbR1_R2" "TGFbR1_R2" "TGFbR1_R2" "ACVR1B_TGFbR2" ... .. .. ..$ agonist : chr [1:1199] "TGFb agonist" "TGFb agonist" "TGFb agonist" "TGFb agonist" ... .. .. ..$ antagonist : chr [1:1199] "TGFb antagonist" "TGFb antagonist" "TGFb antagonist" "TGFb antagonist" ... .. .. ..$ co_A_receptor : chr [1:1199] "" "" "" "" ... .. .. ..$ co_I_receptor : chr [1:1199] "TGFb inhibition receptor" "TGFb inhibition receptor" "TGFb inhibition receptor" "TGFb inhibition receptor" ... .. .. ..$ evidence : chr [1:1199] "KEGG: hsa04350" "KEGG: hsa04350" "KEGG: hsa04350" "PMID: 27449815" ... .. .. ..$ annotation : chr [1:1199] "Secreted Signaling" "Secreted Signaling" "Secreted Signaling" "Secreted Signaling" ... .. .. ..$ interaction_name_2: chr [1:1199] "TGFB1 - (TGFBR1+TGFBR2)" "TGFB2 - (TGFBR1+TGFBR2)" "TGFB3 - (TGFBR1+TGFBR2)" "TGFB1 - (ACVR1B+TGFBR2)" ... .. ..$ complex :'data.frame': 157 obs. of 4 variables: .. .. ..$ subunit_1: chr [1:157] "INHBA" "INHA" "INHA" "IL12A" ... .. .. ..$ subunit_2: chr [1:157] "INHBB" "INHBA" "INHBB" "IL12B" ... .. .. ..$ subunit_3: chr [1:157] "" "" "" "" ... .. .. ..$ subunit_4: chr [1:157] "" "" "" "" ... .. ..$ cofactor :'data.frame': 31 obs. of 16 variables: .. .. ..$ cofactor1 : chr [1:31] "FST" "BAMBI" "TIE1" "PTPRB" ... .. .. ..$ cofactor2 : chr [1:31] "" "" "" "" ... .. .. ..$ cofactor3 : chr [1:31] "" "" "" "" ... .. .. ..$ cofactor4 : chr [1:31] "" "" "" "" ... .. .. ..$ cofactor5 : chr [1:31] "" "" "" "" ... .. .. ..$ cofactor6 : chr [1:31] "" "" "" "" ... .. .. ..$ cofactor7 : chr [1:31] "" "" "" "" ... .. .. ..$ cofactor8 : chr [1:31] "" "" "" "" ... .. .. ..$ cofactor9 : chr [1:31] "" "" "" "" ... .. .. ..$ cofactor10: chr [1:31] "" "" "" "" ... .. .. ..$ cofactor11: chr [1:31] "" "" "" "" ... .. .. ..$ cofactor12: chr [1:31] "" "" "" "" ... .. .. ..$ cofactor13: chr [1:31] "" "" "" "" ... .. .. ..$ cofactor14: chr [1:31] "" "" "" "" ... .. .. ..$ cofactor15: chr [1:31] "" "" "" "" ... .. .. ..$ cofactor16: chr [1:31] "" "" "" "" ... .. ..$ geneInfo :'data.frame': 41787 obs. of 6 variables: .. .. ..$ Symbol : chr [1:41787] "A1BG" "A1BG-AS1" "A1CF" "A2M" ... .. .. ..$ Name : chr [1:41787] "alpha-1-B glycoprotein" "A1BG antisense RNA 1" "APOBEC1 complementation factor" "alpha-2-macroglobulin" ... .. .. ..$ EntrezGene.ID : int [1:41787] 1 503538 29974 2 144571 144568 NA NA 3 127550 ... .. .. ..$ Ensembl.Gene.ID: chr [1:41787] "ENSG00000121410" "ENSG00000268895" "ENSG00000148584" "ENSG00000175899" ... .. .. ..$ MGI.ID : chr [1:41787] "MGI:2152878" "" "MGI:1917115" "MGI:2449119" ... .. .. ..$ Gene.group.name: chr [1:41787] "Immunoglobulin like domain containing" "Antisense RNAs" "RNA binding motif containing" "C3 and PZP like, alpha-2-macroglobulin domain containing" ... ..@ LR :List of 1 .. ..$ LRsig:'data.frame': 656 obs. of 11 variables: .. .. ..$ interaction_name : chr [1:656] "TGFB1_TGFBR1_TGFBR2" "TGFB2_TGFBR1_TGFBR2" "TGFB3_TGFBR1_TGFBR2" "TGFB1_ACVR1B_TGFBR2" ... .. .. ..$ pathway_name : chr [1:656] "TGFb" "TGFb" "TGFb" "TGFb" ... .. .. ..$ ligand : chr [1:656] "TGFB1" "TGFB2" "TGFB3" "TGFB1" ... .. .. ..$ receptor : chr [1:656] "TGFbR1_R2" "TGFbR1_R2" "TGFbR1_R2" "ACVR1B_TGFbR2" ... .. .. ..$ agonist : chr [1:656] "TGFb agonist" "TGFb agonist" "TGFb agonist" "TGFb agonist" ... .. .. ..$ antagonist : chr [1:656] "TGFb antagonist" "TGFb antagonist" "TGFb antagonist" "TGFb antagonist" ... .. .. ..$ co_A_receptor : chr [1:656] "" "" "" "" ... .. .. ..$ co_I_receptor : chr [1:656] "TGFb inhibition receptor" "TGFb inhibition receptor" "TGFb inhibition receptor" "TGFb inhibition receptor" ... .. .. ..$ evidence : chr [1:656] "KEGG: hsa04350" "KEGG: hsa04350" "KEGG: hsa04350" "PMID: 27449815" ... .. .. ..$ annotation : chr [1:656] "Secreted Signaling" "Secreted Signaling" "Secreted Signaling" "Secreted Signaling" ... .. .. ..$ interaction_name_2: chr [1:656] "TGFB1 - (TGFBR1+TGFBR2)" "TGFB2 - (TGFBR1+TGFBR2)" "TGFB3 - (TGFBR1+TGFBR2)" "TGFB1 - (ACVR1B+TGFBR2)" ... ..@ var.features :List of 2 .. ..$ features : chr [1:1146] "CXCL12" "DCN" "RARRES2" "FGF7" ... .. ..$ features.info:'data.frame': 1146 obs. of 7 variables: .. .. ..$ clusters : chr [1:1146] "APOE+ FIB" "APOE+ FIB" "APOE+ FIB" "APOE+ FIB" ... .. .. ..$ features : chr [1:1146] "CXCL12" "DCN" "RARRES2" "FGF7" ... .. .. ..$ pvalues : num [1:1146] 4.95e-271 7.59e-186 4.70e-162 2.21e-138 2.46e-129 ... .. .. ..$ logFC : num [1:1146] 1.643 0.764 1.413 1.367 1.237 ... .. .. ..$ pct.1 : num [1:1146] 0.691 0.899 0.376 0.449 0.542 0.612 0.55 0.173 0.608 0.233 ... .. .. ..$ pct.2 : num [1:1146] 0.198 0.392 0.07 0.132 0.195 0.293 0.218 0.015 0.325 0.046 ... .. .. ..$ pvalues.adj: num [1:1146] 2.75e-268 4.21e-183 2.61e-159 1.23e-135 1.37e-126 ... ..@ dr : list() ..@ options :List of 4 .. ..$ mode : chr "single" .. ..$ datatype : chr "RNA" .. ..$ run.time : num 108 .. ..$ parameter:List of 13 .. .. ..$ type.mean : chr "triMean" .. .. ..$ trim : num 0.1 .. .. ..$ raw.use : logi TRUE .. .. ..$ population.size : logi FALSE .. .. ..$ nboot : num 100 .. .. ..$ seed.use : int 1 .. .. ..$ Kh : num 0.5 .. .. ..$ n : num 1 .. .. ..$ distance.use : NULL .. .. ..$ interaction.length: NULL .. .. ..$ spot.size : NULL .. .. ..$ spot.size.fullres : NULL .. .. ..$ k.min : NULL
提取推断的细胞通讯网络转换为数据框
Cellchat提供了一个函数subsetCommunication
用于容易的访问推断的感兴趣的细胞通讯。
df.net <- subsetCommunication(cellchat)
: 返回在受体/配体水平上推断的细胞-细胞通讯的数据框。设slot.name="netP"
以访问推断的信号通路水平的通讯。df.net <- subsetCommunication(cellchat, sources.use = c(1,2), targets.use = c(4,5))
:给出了从细胞组1和2向向细胞组4和5发送的推断的细胞-细胞间的通讯。df.net <- subsetCommunication(cellchat, signaling = c("WNT", "TGFb"))
:给出了由信号WNT和TGFb介导的推断的细胞-细胞间通讯。
推断信号通路水平的细胞-细胞间通讯
CellChat通过总结与每个信号通路相关的所有配体-受体相互作用的通信概率,计算信号通路水平上的通信概率。
注意:每个配体-受体对和每个信号通路的推断细胞间通信网络分别存储在“net”和“netP”槽中。
cellchat <- computeCommunProbPathway(cellchat)
计算聚合的细胞通讯网络
通过对链接的数量或通讯概率的总结,计算聚合的细胞通讯网络。用户还可以通过设置sources.use
和targets.use
来计算cell group的子集之间的聚合网络。
cellchat <- aggregateNet(cellchat)
可视化聚集细胞通讯网络。例如使用circle plot
显示任意两组细胞之间的相互作用次数或总相互作用强度。
groupSize <- as.numeric(table(cellchat@idents))
par(mfrow = c(1,2), xpd=TRUE)
netVisual_circle(cellchat@net$count, vertex.weight = groupSize, weight.scale = T, label.edge= F, title.name = "Number of interactions")
netVisual_circle(cellchat@net$weight, vertex.weight = groupSize, weight.scale = T, label.edge= F, title.name = "Interaction weights/strength")
由于细胞通讯网络的复杂性,我们可以检查每个cell group发出的信号。我们还控制参数edge.weight.max
,以便比较不同网络之间的边的权重。
mat <- cellchat@net$weight
par(mfrow = c(3,4), xpd=TRUE)
for (i in 1:nrow(mat)) {
mat2 <- matrix(0, nrow = nrow(mat), ncol = ncol(mat), dimnames = dimnames(mat))
mat2[i, ] <- mat[i, ]
netVisual_circle(mat2, vertex.weight = groupSize, weight.scale = T, edge.weight.max = max(mat), title.name = rownames(mat)[i])
}
细胞通讯网络的可视化
信号通路可视化
细胞通讯可视化的方法包括:hierarchical plot
, circle plot
, Chord diagram
,bubble plot
。它可以随时预测cell群体的主要信号输入和输出,以及这些群体和信号如何协调在一起实现功能。
通过结合social network analysis
、pattern recognition
、manifold learning
的方法,可以对推断的细胞通讯网络进行定量的表征和比较。
hierarchical plot
使用参数vertex.receiver
指定需要展示的细胞类型,hierarchical plot
由两个部分组成,左侧对某些感兴趣的细胞类型(即定义的vertex.receiver)显示自分泌和旁分泌(autocrine and paracrine)信号,右侧向数据集中的其余细胞类型显示自分泌和旁分泌信号。
在这里,我们以一个信号通路的输入为例。cellchat@netP$pathways
可以访问显示重要通信的所有信号通路。
pathways.show <- c("CXCL")
# Hierarchy plot
# Here we define `vertex.receive` so that the left portion of the hierarchy plot shows signaling to fibroblast and the right portion shows signaling to immune cells
vertex.receiver = seq(1,4) # a numeric vector.
netVisual_aggregate(cellchat, signaling = pathways.show, vertex.receiver = vertex.receiver, layout = "hierarchy")
circle plot
par(mfrow=c(1,1))
netVisual_aggregate(cellchat, signaling = pathways.show, layout = "circle")
Chord diagram
CellChat提供了两个函数netVisual_chord_cell
(通过调整Circlize软件包中的不同参数来灵活地可视化信号网络)和netVisual_chord_gene
用于以不同的目的不同的级别可视化细胞通讯。netVisual_chord_cell
用于可视化不同细胞类型之间的细胞细胞通信(在和弦图中的每个扇区都是细胞类型);netvisual_chord_gene
用于可视化由多个配体受体或信号通路介导的细胞间通信(其中弦图中的每一个扇区是配体、受体或信号途径)。
par(mfrow=c(1,1))
netVisual_aggregate(cellchat, signaling = pathways.show, layout = "chord")
# group.cellType <- c(rep("FIB", 4), rep("DC", 4), rep("TC", 4)) # grouping cell clusters into fibroblast, DC and TC cells
# names(group.cellType) <- levels(cellchat@idents)
# netVisual_chord_cell(cellchat, signaling = pathways.show, group = group.cellType, title.name = paste0(pathways.show, " signaling network"))
netVisual_chord_gene(cellchat, signaling = pathways.show)
heatmap plot
netVisual_heatmap(cellchat, signaling = pathways.show, color.heatmap = "Reds",remove.isolate = T)
计算每个配体受体对对整体信号通路的贡献,并可视化由单个配体受体对介导的细胞 - 细胞通信、
netAnalysis_contribution(cellchat, signaling = pathways.show)
我们还可以看到由单个配体-受体对介导的细胞间通信。我们提供了一个函数extractEnrichedLR
,用于提取给定信号通路的所有重要相互作用(L-R对)和相关信号基因。
pairLR.CXCL <- extractEnrichedLR(cellchat, signaling = pathways.show, geneLR.return = FALSE)
LR.show <- pairLR.CXCL[1,] # show one ligand-receptor pair
# Hierarchy plot
vertex.receiver = seq(1,4) # a numeric vector
netVisual_individual(cellchat, signaling = pathways.show, pairLR.use = LR.show, vertex.receiver = vertex.receiver, layout = "hierarchy")
netVisual_individual(cellchat, signaling = pathways.show, pairLR.use = LR.show, layout = "circle")
可视化由多个配体受体或信号通路介导的细胞细胞通信
Bubble plot
使用函数netVisual_Bubble
显示从某些细胞类型到其他细胞类型的说有重要交互(L-R
对)。
netVisual_bubble(cellchat, sources.use = 4, targets.use = c(5:11), remove.isolate = FALSE)
使用函数
netVisual_bubble
,显示一些细胞类型sources.use
到另外一些细胞类型targets.use
显著交互的L-R对
netVisual_bubble(cellchat, sources.use = 4, targets.use = c(5:11), signaling = c("CCL","CXCL"), remove.isolate = FALSE)
pairLR.use <- extractEnrichedLR(cellchat, signaling = c("CCL","CXCL","FGF")) netVisual_bubble(cellchat, sources.use = c(3,4), targets.use = c(5:8), pairLR.use = pairLR.use, remove.isolate = TRUE)
Chord diagram
cellchay使用函数netVisual_chord_gene
绘制从一些细胞类型到另外一些细胞类型交互的L-R对。
netVisual_chord_gene(cellchat, sources.use = 4, targets.use = c(5:11), lab.cex = 0.5,legend.pos.y = 30)
显示所有交互的L-R对,从一个细胞类型到另一个细胞类型
使用小提琴图/点图绘制信号基因表达分析
使用seurat的包装函数plotGeneExpression
,绘制与L-R对或信号通络相关的信号基因的基因表达分布。
plotGeneExpression(cellchat, signaling = "CXCL")
·> 默认情况下,PlotGeneexpression仅显示与推断出的显着通信相关的信号基因的表达。可以通过
enriched.only = FALSE
显示所有。
细胞通讯网络的系统分析
识别细胞群的信号角色,以及主要的贡献信号
chellchat通过计算每个细胞群的多个网络中心的度量,可以识别细胞信号中的主要发送者、接受者、调解者和影响者
(dominant senders, receivers, mediators and influencers)。具体而言,我们使用了加权定向网络中的措施,包括分别为细胞间通信的主要主要发送者、接受者、调解者和影响者,分别确定了主导的主要发送者、接受者、调解者和影响者。在以权重作为计算通信概率的加权定向网络中,outdegree
(计算为来自细胞组的传出信号的通信概率之和)in-degree
(计算为传入信号到细胞组的通信概率之和)可以分别用于识别信号网络的主要细胞发送方和接收方。
计算和可视化网络中心分数
cellchat <- netAnalysis_computeCentrality(cellchat, slot.name = "netP") # the slot 'netP' means the inferred intercellular communication network of signaling pathways
# Visualize the computed centrality scores using heatmap, allowing ready identification of major signaling roles of cell groups
netAnalysis_signalingRole_network(cellchat, signaling = pathways.show, width = 8, height = 2.5, font.size = 10)
计算2D空间主要发送者(sources)和接受者(targets)
# Signaling role analysis on the aggregated cell-cell communication network from all signaling pathways
gg1 <- netAnalysis_signalingRole_scatter(cellchat)
#> Signaling role analysis on the aggregated cell-cell communication network from all signaling pathways
# Signaling role analysis on the cell-cell communication networks of interest
gg2 <- netAnalysis_signalingRole_scatter(cellchat, signaling = c("CXCL", "CCL"))
#> Signaling role analysis on the cell-cell communication network from user's input
gg1 + gg2
确定某些细胞群的传出或转入信号的最大贡献信号
# Signaling role analysis on the aggregated cell-cell communication network from all signaling pathways
ht1 <- netAnalysis_signalingRole_heatmap(cellchat, pattern = "outgoing")
ht2 <- netAnalysis_signalingRole_heatmap(cellchat, pattern = "incoming")
ht1 + ht2
确定全局通讯模式,探索多种细胞类型和信号通路如何协调一致
除了探索说个途径的详细信息外,一个重要的问题是多个细胞群体和信号通路如何协调。CellChat采用一种模式识别的方法来识别全局通信模式。
随着模式数量级的增加,可能会有冗余模式,因此很难解释通讯模式。通常档模式数量大于2时,它在生物学上具有意义。此外cellchat还提供了函数selectK
推断模式的数量。该函数基于NMF R包中已经实现的两个度量,包括Cophenetic
和Silhouette
。这两个度量都基于一致性矩阵的分层聚类来测量特定数量的模式的稳定性。对于一定数量的模式,适当数量的模式是Cophenetic和Silhouette值开始突然下降的模式。
为了将细胞群与其富集的信号通路直接联系起来,我们将W和H中的元素设置为零,如果它们小于1/R,其中R是潜在模式的数量。
识别和可视化分泌细胞的传出通信模式
输出模式揭示了发送细胞(即作为信号源的细胞)如何彼此协调,以及它们如何协调特定的信号通路来驱动通信。
为了直观地显示潜在模式与细胞群、配体-受体对或信号通路的关联,我们使用了river (alluvial) plots
。我们首先将W的每一行和H的每一列归一化为[0,1],然后如果W和H中的元素小于0.5,则将它们设置为零。这样的阈值化允许揭示与每个推断的模式相关联的最富集的细胞组和信号通路,即,每个细胞组或信号通路仅与一个推断的模式相关联。这些阈值矩阵W和H被用作创建alluvial
图的输入。
为了将细胞群与其富集的信号通路直接联系起来,我们将W和H中的元素设置为零,如果它们小于1/R,其中R是潜在模式的数量。通过使用不那么严格的阈值,可以获得与每个细胞组相关的更丰富的信号通路。使用通过将W乘以H计算的每个细胞组对每个信号通路的贡献分数,我们构建了一个点图,其中点大小与贡献分数成比例,以显示细胞组与其富集的信号通路之间的关联。USERS还可以降低参数截止值,以显示与每个细胞组相关的更丰富的信号通路。
library(NMF)
library(ggalluvial)
selectK(cellchat, pattern = "outgoing")
当传出模式的数量为3时,Cophenetic和Silhouette值都开始突然下降。
nPatterns = 3
cellchat <- identifyCommunicationPatterns(cellchat, pattern = "outgoing", k = nPatterns)
# river plot
netAnalysis_river(cellchat, pattern = "outgoing")
#> Please make sure you have load `library(ggalluvial)` when running this function
netAnalysis_dot(cellchat, pattern = "outgoing")
识别和可视化目标细胞的传入通信模式
传入模式显示了目标细胞(即作为信号接收器的细胞)如何相互协调,以及它们如何与某些信号通路协调以响应传入信号。
selectK(cellchat, pattern = "incoming")
当传入模式的数量为4时,Cophenetic
值开始下降。
nPatterns = 4
cellchat <- identifyCommunicationPatterns(cellchat, pattern = "incoming", k = nPatterns)
信号网络的流形和分类学习分析
此外,CellChat能够量化所有显着的信号通路之间的相似性,然后根据其细胞通信网络相似性进行分组。可以根据功能或结构相似性进行分组。
功能相似性:高度的功能相似性表明主要的发送者和接收者相似,可以解释为两个信号通路或两个配体-受体对表现出相似和/或冗余的作用。功能相似性分析需要两个数据集之间具有相同的细胞群体组成。
结构相似性:使用结构相似性来比较它们的信号网络结构,而不考虑发送者和接收者的相似性。
根据功能相似性识别信号组
cellchat <- computeNetSimilarity(cellchat, type = "functional")
cellchat <- netEmbedding(cellchat, type = "functional")
#> Manifold learning of the signaling networks for a single dataset
cellchat <- netClustering(cellchat, type = "functional")
#> Classification learning of the signaling networks for a single dataset
# Visualization in 2D-space
netVisual_embedding(cellchat, type = "functional", label.size = 3.5)
基于结构相似性识别信号组
cellchat <- computeNetSimilarity(cellchat, type = "structural")
cellchat <- netEmbedding(cellchat, type = "structural")
#> Manifold learning of the signaling networks for a single dataset
cellchat <- netClustering(cellchat, type = "structural")
#> Classification learning of the signaling networks for a single dataset
# Visualization in 2D-space
netVisual_embedding(cellchat, type = "structural", label.size = 3.5)
使用CellChat对多个数据集进行比较分析
前面的数据集都是针对一个数据集合(一个分组)的分析,接下来参考Comparison analysis of multiple datasets using CellChat介绍使用cellchat对多个数据集进行分析。
cellchat多细胞-细胞通讯网络的联合流形学习(manifold learning
)和定量比对(quantitative contrast
)来识别主要的信号变化,以及保守的和上下文特异的信号。接下来使用两种不同生物条件下的单细胞数据展示其不同的功能。特应性皮炎患者的非残留(NL,正常)和病变(LS,患病)人类皮肤。这两个数据集(条件)在联合聚类后具有相同的细胞群体组成。如果不同数据集之间存在略有或截然不同的细胞种群组成,请查看Comparison analysis of multiple datasets with different cell type compositions
CellChat采用自上而下的方法,即从大局出发,然后对信号机制进行更详细的细化,以识别不同水平的信号变化,包括细胞-细胞通信的一般原理和功能失调的细胞群体/信号通路/配体受体。
library(CellChat) load("data_humanSkin_CellChat_1683686262168.rda") cellchatObj <- function(group){ data.input = data_humanSkin$data # normalized data matrix meta = data_humanSkin$meta # a dataframe with rownames containing cell mata data cell.use = rownames(meta)[meta$condition ==group] # extract the cell names from disease data data.input = data.input[, cell.use] meta = meta[cell.use, ] cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels") cellchat <- addMeta(cellchat, meta = meta) cellchat <- setIdent(cellchat, ident.use = "labels") # set "labels" as default cell identity levels(cellchat@idents) # show factor levels of the cell labels groupSize <- as.numeric(table(cellchat@idents)) # number of cells in each cell group CellChatDB <- CellChatDB.human # use CellChatDB.mouse if running on mouse data CellChatDB.use <- subsetDB(CellChatDB, search = "Secreted Signaling") # use Secreted Signaling cellchat@DB <- CellChatDB.use cellchat <- subsetData(cellchat) # This step is necessary even if using the whole database future::plan("multisession", workers = 4) # do parallel cellchat <- identifyOverExpressedGenes(cellchat) cellchat <- identifyOverExpressedInteractions(cellchat) cellchat <- computeCommunProb(cellchat) cellchat <- filterCommunication(cellchat, min.cells = 10) cellchat <- computeCommunProbPathway(cellchat) cellchat <- aggregateNet(cellchat) } cellchat.LS <- cellchatObj("LS") cellchat.NL <- cellchatObj("NL") object.list <- list(NL = cellchat.NL, LS = cellchat.LS) cellchat <- mergeCellChat(object.list, add.names = names(object.list))
比较相互作用的总数和相互作用强度
gg1 <- compareInteractions(cellchat, show.legend = F, group = c(1,2))
gg2 <- compareInteractions(cellchat, show.legend = F, group = c(1,2), measure = "weight")
gg1 + gg2
比较不同细胞群间的相互作用次数和相互作用强度
为了确定哪些细胞群体显示出显著的变化,CellChat 比较了不同细胞群体之间的交互数量和交互强度。
不同细胞群体(populations)之间的相互作用数量或相互作用强度的差异
在细胞-细胞通信网络中,两个数据集之间的相互作用数量或相互作用强度的差异可以用圆图来显示,在第二个数据集相比第一个中红色(或蓝色)边代表增加(或减少)信号。
par(mfrow = c(1,2), xpd=TRUE)
netVisual_diffInteraction(cellchat, weight.scale = T)
netVisual_diffInteraction(cellchat, weight.scale = T, measure = "weight")
我们还可以使用热图在更详细的细节中显示不同数量的相互作用或相互作用强度。顶部彩色条形图表示热图(传入信号)中显示的值的列之和。右边的彩色条形图表示值行之和(传出信令)。在颜色栏中,与第一个数据集相比,红色(或蓝色)表示第二个数据集中信号增加(或减少)。
gg1 <- netVisual_heatmap(cellchat)
#> Do heatmap based on a merged object
gg2 <- netVisual_heatmap(cellchat, measure = "weight")
#> Do heatmap based on a merged object
gg1 + gg2
差异网络分析只适用于成对数据集。如果有更多的数据集进行比较,我们可以直接显示每个数据集中任意两个细胞群之间的交互数量或交互强度。
weight.max <- getMaxWeight(object.list, attribute = c("idents","count"))
par(mfrow = c(1,2), xpd=TRUE)
for (i in 1:length(object.list)) {
netVisual_circle(object.list[[i]]@net$count, weight.scale = T, label.edge= F, edge.weight.max = weight.max[2], edge.width.max = 12, title.name = paste0("Number of interactions - ", names(object.list)[i]))
}
不同细胞类型(type)之间的相互作用数量或相互作用强度的差异
为了简化复杂的网络,在细胞类型层次上深入了解细胞间通信,我们可以基于已定义的细胞群聚合细胞间通信。在这里,我们将细胞群分为三种细胞类型,然后重新合并 CellChat 对象的列表。
group.cellType <- c(rep("FIB", 4), rep("DC", 4), rep("TC", 4))
group.cellType <- factor(group.cellType, levels = c("FIB", "DC", "TC"))
object.list <- lapply(object.list, function(x) {mergeInteractions(x, group.cellType)})
cellchat <- mergeCellChat(object.list, add.names = names(object.list))
同样,我们也可以用circle plot
来显示任意两种细胞类型之间的相互作用数量或相互作用强度的差异。与第一个数据集相比,第二个数据集中的红(或蓝)色边表示信号增加(或减少)。
比较2D空间中的主要来源和目标
在二维空间中比较输出和输入的相互作用强度可以准确识别不同数据集之间发送或接收信号有显著变化的细胞群。
num.link <- sapply(object.list, function(x) {rowSums(x@net$count) + colSums(x@net$count)-diag(x@net$count)})
weight.MinMax <- c(min(num.link), max(num.link)) # control the dot size in the different datasets
gg <- list()
for (i in 1:length(object.list)) {
gg[[i]] <- netAnalysis_signalingRole_scatter(object.list[[i]], title = names(object.list)[i], weight.MinMax = weight.MinMax)
}
#> Signaling role analysis on the aggregated cell-cell communication network from all signaling pathways
#> Signaling role analysis on the aggregated cell-cell communication network from all signaling pathways
patchwork::wrap_plots(plots = gg)
从散点图可以看出,Inflam.DC 和 cDC1是 LS 与 NL 相比的主要来源和目标。成纤维细胞群也成为 LS 的主要来源。
识别与一个细胞群相关的信号通路变化
gg1 <- netAnalysis_signalingChanges_scatter(cellchat, idents.use = "Inflam. DC", signaling.exclude = "MIF")
#> Visualizing differential outgoing and incoming signaling changes from NL to LS
#> The following `from` values were not present in `x`: 0
#> The following `from` values were not present in `x`: 0, -1
gg2 <- netAnalysis_signalingChanges_scatter(cellchat, idents.use = "cDC1", signaling.exclude = c("MIF"))
#> Visualizing differential outgoing and incoming signaling changes from NL to LS
#> The following `from` values were not present in `x`: 0, 2
#> The following `from` values were not present in `x`: 0, -1
patchwork::wrap_plots(plots = list(gg1,gg2))
识别保守的和上下文特定的信号传导途径
CellChat 可以识别具有较大(或较小)差异的信号传导网络、信号传导组以及基于其在多种生物条件下的细胞-细胞通信网络的保守的和上下文特定的信号传导途径。
根据信号群的功能/结构相似性,识别具有较大(或较小)差异的信号网络和信号群
CellChat 根据通信网络的功能和拓扑相似性对其进行联合流形学习和分类。注意: 这种分析适用于两个以上的数据集。
根据信号群的功能相似性来识别信号群
cellchat <- computeNetSimilarityPairwise(cellchat, type = "functional")
#> Compute signaling network similarity for datasets 1 2
cellchat <- netEmbedding(cellchat, type = "functional")
#> Manifold learning of the signaling networks for datasets 1 2
cellchat <- netClustering(cellchat, type = "functional")
#> Classification learning of the signaling networks for datasets 1 2
# Visualization in 2D-space
netVisual_embeddingPairwise(cellchat, type = "functional", label.size = 3.5)
#> 2D visualization of signaling networks from datasets 1 2
基于结构相似度的信号群识别
cellchat <- computeNetSimilarityPairwise(cellchat, type = "structural")
#> Compute signaling network similarity for datasets 1 2
cellchat <- netEmbedding(cellchat, type = "structural")
#> Manifold learning of the signaling networks for datasets 1 2
cellchat <- netClustering(cellchat, type = "structural")
#> Classification learning of the signaling networks for datasets 1 2
# Visualization in 2D-space
netVisual_embeddingPairwise(cellchat, type = "structural", label.size = 3.5)
#> 2D visualization of signaling networks from datasets 1 2
netVisual_embeddingPairwiseZoomIn(cellchat, type = "structural", nCol = 2)
计算并可视化learned joint manifold
中的路径距离
我们可以根据信令网络在共享的二维空间中的欧几里得度量来识别出差异较大(或较小)的信令网络。较大的距离意味着两个数据集之间的通信网络在功能或结构相似性方面的差异较大。注意: 我们只计算两个数据集之间重叠信号通路的距离。这里不考虑那些只在一个数据集中识别的信号通路。如果有三个以上的数据集,那么可以通过在函数 rank 相似性中定义比较来进行成对比较。
rankSimilarity(cellchat, type = "functional")
识别和可视化保守和上下文特定的信号传导途径
通过比较每个信号通路的信息流/相互作用强度,我们可以通过在一个条件下与另一个条件相比改变它们的信息流来识别信号通路,(i)关闭,(ii)减少,(iii)打开或(iv)增加。
比较各个信号通路的整体信息流
我们可以通过简单地比较每个信号通路的信息流来识别保守的和上下文特异性的信号通路,其由推断网络中所有对细胞组之间的通信概率的总和(即网络中的总权重)定义。
这个条形图可以以叠加模式绘制,也可以不绘制。根据 NL 和 LS 皮肤之间推断的网络中总体信息流的差异,对重要的信号通路进行了排序。红色的顶端信号通路在 NL 皮肤中富集,而这些绿色的信号通路在 LS 皮肤中富集。
gg1 <- rankNet(cellchat, mode = "comparison", stacked = T, do.stat = TRUE)
gg2 <- rankNet(cellchat, mode = "comparison", stacked = F, do.stat = TRUE)
gg1 + gg2
比较与每个细胞群相关的传出(或传入)信号
上面的分析总结了来自传出和传入信令的信息。我们还可以比较两个数据集之间的传出(或传入)信号传导模式,以便识别表现出不同信号传导模式的信号传导途径/配体受体。
我们可以将来自不同数据集的所有已识别的信号通路结合起来,通过将传出信号和传入信号聚合在一起来并排比较它们,包括传出信号、传入信号和整体信号。注意: rankNet 也显示了整体信号传导的比较,但是它没有显示特定细胞群中的信号传导强度。
library(ComplexHeatmap)
i = 1
pathway.union <- union(object.list[[i]]@netP$pathways, object.list[[i+1]]@netP$pathways)
ht1 = netAnalysis_signalingRole_heatmap(object.list[[i]], pattern = "outgoing", signaling = pathway.union, title = names(object.list)[i], width = 5, height = 6)
ht2 = netAnalysis_signalingRole_heatmap(object.list[[i+1]], pattern = "outgoing", signaling = pathway.union, title = names(object.list)[i+1], width = 5, height = 6)
draw(ht1 + ht2, ht_gap = unit(0.5, "cm"))
ht1 = netAnalysis_signalingRole_heatmap(object.list[[i]], pattern = "incoming", signaling = pathway.union, title = names(object.list)[i], width = 5, height = 6, color.heatmap = "GnBu")
ht2 = netAnalysis_signalingRole_heatmap(object.list[[i+1]], pattern = "incoming", signaling = pathway.union, title = names(object.list)[i+1], width = 5, height = 6, color.heatmap = "GnBu")
draw(ht1 + ht2, ht_gap = unit(0.5, "cm"))
ht1 = netAnalysis_signalingRole_heatmap(object.list[[i]], pattern = "all", signaling = pathway.union, title = names(object.list)[i], width = 5, height = 6, color.heatmap = "OrRd")
ht2 = netAnalysis_signalingRole_heatmap(object.list[[i+1]], pattern = "all", signaling = pathway.union, title = names(object.list)[i+1], width = 5, height = 6, color.heatmap = "OrRd")
draw(ht1 + ht2, ht_gap = unit(0.5, "cm"))
识别上调和下调的信号传导配体-受体对
通过比较通信概率来识别功能失调的信号
我们可以比较一些细胞群和其他细胞群的配体-受体对介导的通讯概率。这可以通过在函数netVisual_bubble
中设置comparison
来完成。
netVisual_bubble(cellchat, sources.use = 4, targets.use = c(5:11), comparison = c(1, 2), angle.x = 45)
此外,我们可以识别上调(增加)和下调(减少)信号配体-受体对在一个数据集相比,其他数据集。这可以通过在函数netVisual_bubble
中指定max.dataset
和 min.dataset
来完成。增加的信令意味着这些信令有更高的通信概率(强度)在一个数据集相比,其他数据集。
gg1 <- netVisual_bubble(cellchat, sources.use = 4, targets.use = c(5:11), comparison = c(1, 2), max.dataset = 2, title.name = "Increased signaling in LS", angle.x = 45, remove.isolate = T)
#> Comparing communications on a merged object
gg2 <- netVisual_bubble(cellchat, sources.use = 4, targets.use = c(5:11), comparison = c(1, 2), max.dataset = 1, title.name = "Decreased signaling in LS", angle.x = 45, remove.isolate = T)
#> Comparing communications on a merged object
gg1 + gg2
注意:
bubble plot
中显示的配体-受体对可以通过signaling.LSIncreased = gg1$data
访问
利用差异表达分析识别功能失调的信号传导
上述识别上调和下调信号的方法是通过比较每个 L-R 对和每个细胞组的两个数据集之间的通信概率来完成的。或者,我们可以根据差异基因表达分析鉴定上调和下调的信号传导配体-受体对。具体而言,我们对每个细胞组进行两种生物学条件(即 NL 和 LS)之间的差异表达分析,然后基于发送细胞和受体细胞中配体的倍数变化获得上调和下调的信号传导。
# 定义一个正数据集,即与其他数据集相比具有正倍数变化的数据集
pos.dataset = "LS"
# 定义用于存储差异表达分析结果的字符名
features.name = pos.dataset
# 执行差异表达分析
cellchat <- identifyOverExpressedGenes(cellchat, group.dataset = "datasets", pos.dataset = pos.dataset, features.name = features.name, only.pos = FALSE, thresh.pc = 0.1, thresh.fc = 0.1, thresh.p = 1)
#> 使用合并的 CellChat 对象中的joint cell labels
# 将差异表达分析的结果映射到推断的细胞-细胞通讯上,以便于管理/子集感兴趣的配体-受体对
net <- netMappingDEG(cellchat, features.name = features.name)
# 用 LS 中上调的配体提取配体-受体对
net.up <- subsetCommunication(cellchat, net = net, datasets = "LS",ligand.logFC = 0.2, receptor.logFC = NULL)
# 在 NL 中提取具有上调的配体和上调的受体的配体-受体对,即在 LS 中下调
net.down <- subsetCommunication(cellchat, net = net, datasets = "NL",ligand.logFC = -0.1, receptor.logFC = -0.1)
由于 net.up 和 net.down 中的信号基因可能具有复杂的多亚基,我们可以进一步进行反卷积来获得单个信号基因。
gene.up <- extractGeneSubsetFromPair(net.up, cellchat)
gene.down <- extractGeneSubsetFromPair(net.down, cellchat)
然后我们使用bubble plot
或chord diagram
来显示上调和下调的信号配体-受体对。
pairLR.use.up = net.up[, "interaction_name", drop = F]
gg1 <- netVisual_bubble(cellchat, pairLR.use = pairLR.use.up, sources.use = 4, targets.use = c(5:11), comparison = c(1, 2), angle.x = 90, remove.isolate = T,title.name = paste0("Up-regulated signaling in ", names(object.list)[2]))
#> Comparing communications on a merged object
pairLR.use.down = net.down[, "interaction_name", drop = F]
gg2 <- netVisual_bubble(cellchat, pairLR.use = pairLR.use.down, sources.use = 4, targets.use = c(5:11), comparison = c(1, 2), angle.x = 90, remove.isolate = T,title.name = paste0("Down-regulated signaling in ", names(object.list)[2]))
#> Comparing communications on a merged object
gg1 + gg2
使用弦图可视化上调和下调的信号传导配体-受体对
# Chord diagram
par(mfrow = c(1,2), xpd=TRUE)
netVisual_chord_gene(object.list[[2]], sources.use = 4, targets.use = c(5:11), slot.name = 'net', net = net.up, lab.cex = 0.8, small.gap = 3.5, title.name = paste0("Up-regulated signaling in ", names(object.list)[2]))
netVisual_chord_gene(object.list[[1]], sources.use = 4, targets.use = c(5:11), slot.name = 'net', net = net.down, lab.cex = 0.8, small.gap = 3.5, title.name = paste0("Down-regulated signaling in ", names(object.list)[2]))
使用wordcloud
在一个条件下比较另一个条件下富集的配体、信号传导或配体-受体对
computeEnrichmentScore(net.down, species = 'human')
computeEnrichmentScore(net.up, species = 'human')
使用Hierarchy plot
、Circle plot
或Chord diagram
可视化地比较细胞间通信
类似于 CellChat 对单个数据集的分析,我们可以使用Hierarchy plot
、Circle plot
或Chord diagram
来可视化细胞-细胞通信网络。
在所有可视化图中,边颜色与作为发送者的源一致,边权重与交互强度成正比。较粗的边线表示信号较强。在Hierarchy plot
和Circle plot
中,圆的大小与每个细胞组中的细胞数成正比。在Hierarchy plot
图中,实心圆和开放圆分别代表源和目标。在Chord diagram
中,内部较薄的条形颜色表示从相应的外部条形图接收信号的目标,inner bar 尺寸与目标接收到的信号强度成正比,这样的inner bar有助于解释复杂的Chord diagram
。请注意,对于一些细胞组有一些inner bar没有任何Chord,请忽略它,因为这是一个问题,尚未由R包circlize
解决。
pathways.show <- c("CXCL")
weight.max <- getMaxWeight(object.list, slot.name = c("netP"), attribute = pathways.show) # control the edge weights across different datasets
par(mfrow = c(1,2), xpd=TRUE)
for (i in 1:length(object.list)) {
netVisual_aggregate(object.list[[i]], signaling = pathways.show, layout = "circle", edge.weight.max = weight.max[1], edge.width.max = 10, signaling.name = paste(pathways.show, names(object.list)[i]))
}
pathways.show <- c("CXCL")
par(mfrow = c(1,2), xpd=TRUE)
ht <- list()
for (i in 1:length(object.list)) {
ht[[i]] <- netVisual_heatmap(object.list[[i]], signaling = pathways.show, color.heatmap = "Reds",title.name = paste(pathways.show, "signaling ",names(object.list)[i]))
}
#> Do heatmap based on a single object
#>
#> Do heatmap based on a single object
ComplexHeatmap::draw(ht[[1]] + ht[[2]], ht_gap = unit(0.5, "cm"))
# Chord diagram
pathways.show <- c("CXCL")
par(mfrow = c(1,2), xpd=TRUE)
for (i in 1:length(object.list)) {
netVisual_aggregate(object.list[[i]], signaling = pathways.show, layout = "chord", signaling.name = paste(pathways.show, names(object.list)[i]))
}
比较不同数据集间信号传导基因的表达分布
我们可以使用 Seurat 包装函数plotGeneExpression
绘制与 L-R 对或信号通路相关的信号基因的基因表达分布。
cellchat@meta$datasets = factor(cellchat@meta$datasets, levels = c("NL", "LS")) # set factor level
plotGeneExpression(cellchat, signaling = "CXCL", split.by = "datasets", colors.ggplot = T)
小实验
library(CellChat) df <- tribble(~cell_1,~cell_2,~cell_3,~cell_4,~cell_5, 1,2,1,1,1, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0, 0,0,0,0,0) df <- as.data.frame(df) rownames(df) <- c("cell_1","cell_2","cell_3","cell_4","cell_5") df <- as.matrix(df) netVisual_circle(df, vertex.weight = 5, weight.scale = T, edge.weight.max = max(df),title.name = "Number of interactions")