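This article was first published on the WeChat public account “生信补给站”: https://mp.weixin.qq.com/s/8kz2oKvUQrCR2_HWYXQT4g
If you have a MAF-format file, you can draw the waterfall plot directly with the oncoplot function of the maftools package, which offers many display and summary options; see the earlier posts “maftools | Drawing a publication-quality oncoplot (waterfall plot) from scratch” and “maftools | Summarizing, analyzing and visualizing TCGA tumor mutation data”. If all you have is an Excel file recording whether each gene is mutated in each sample, don't worry: the ComplexHeatmap package can draw the plot as well.
This package is very powerful; this post only gives a brief introduction to drawing a genomic landscape plot (waterfall plot).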
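1 Load the R packages and data
#if (!requireNamespace("BiocManager", quietly = TRUE))
#    install.packages("BiocManager")
#BiocManager::install("ComplexHeatmap")
#install.packages("openxlsx")
#install.packages("circlize")
# Once installed, simply load the packages
library(openxlsx)
library(ComplexHeatmap)
library(circlize)
# Read in the data
# Sheet "突变信息" holds the mutation information, sheet "临床信息" the clinical information
mut <- read.xlsx("TCGA_data.xlsx", sheet = "突变信息")
cli <- read.xlsx("TCGA_data.xlsx", sheet = "临床信息")
Inspect the variant data
# Use the `sample` column as row names, then drop it from the data
rownames(mut) <- mut$sample
mat <- mut[, -1]
# Replace missing values with empty strings (no alteration)
mat[is.na(mat)] <- ""
# oncoPrint expects genes in rows and samples in columns
mat[1:6, 1:6]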
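For readers without the Excel file, the expected shape of `mat` can be sketched with a small toy matrix. The gene and sample names below are made up for illustration. The assumption is that rows are genes, columns are samples, each cell holds an alteration type, and an empty string means no alteration.

```r
# Toy stand-in for `mat`; gene and sample names are hypothetical
toy_mat <- matrix(
    c("mutation", "",         "indel",    "",
      "",         "mutation", "mutation", "",
      NA,         "",         "",         ""),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("GeneA", "GeneB", "GeneC"),
                    c("S1", "S2", "S3", "S4"))
)
# Same cleanup as above: missing values become empty strings
toy_mat[is.na(toy_mat)] <- ""
toy_mat
```

A matrix like this can be passed to `oncoPrint()` in place of `mat` to try out the later steps.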
2 Draw the oncoplot
2.0 Draw an “initial” waterfall plot
oncoPrint(mat)
This already displays the result, but for a paper some adjustments are still needed!
2.1 Specify the color, shape and size of each alteration type
# Specify the colors; just change the color codes as needed
col <- c("mutation" = "blue", "indel" = "green")
# Specify how each alteration is drawn: x and y give the cell position, w its width and h its height
alter_fun <- list(
    background = function(x, y, w, h) {
        grid.rect(x, y, w - unit(0.5, "mm"), h - unit(0.5, "mm"),
                  gp = gpar(fill = "#CCCCCC", col = NA))
    },
    mutation = function(x, y, w, h) {
        grid.rect(x, y, w - unit(0.5, "mm"), h - unit(0.5, "mm"),
                  gp = gpar(fill = col["mutation"], col = NA))
    },
    indel = function(x, y, w, h) {
        grid.rect(x, y, w - unit(0.5, "mm"), h * 0.33,
                  gp = gpar(fill = col["indel"], col = NA))
    }
)
# Legend labels for the alteration types; they must match the types in the data
heatmap_legend_param <- list(title = "Alterations",
                             at = c("mutation", "indel"),
                             labels = c("mutation", "indel"))

Draw the landscape plot

# Set the title
column_title <- "This is Oncoplot"
# Draw the plot
oncoPrint(mat,
          alter_fun = alter_fun,
          col = col,
          column_title = column_title,
          heatmap_legend_param = heatmap_legend_param)
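When a sample carries more than one alteration in the same gene, a common convention is to store them in one cell separated by a semicolon, for example "mutation;indel". Each type is then drawn by its own function in `alter_fun`. That is why `indel` above is drawn at one third of the cell height: it stays visible on top of a full-height `mutation` rectangle. The base-R sketch below only illustrates the splitting step. The helper name `split_types` is hypothetical, and this is not the package's internal code.

```r
# Hypothetical helper: split one cell into its alteration types,
# assuming multiple types are separated by ";"
split_types <- function(cell) {
    if (cell == "") return(character(0))
    strsplit(cell, ";", fixed = TRUE)[[1]]
}

split_types("mutation;indel")  # two types, each drawn by its own alter_fun entry
split_types("mutation")        # a single type
split_types("")                # no alteration: only the background is drawn
```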
2.2 Simple adjustments
oncoPrint(mat,
          alter_fun = alter_fun,
          col = col,
          column_title = column_title,
          remove_empty_columns = TRUE,  # drop samples without any alteration
          remove_empty_rows = TRUE,     # drop genes without any alteration
          row_names_side = "left",      # gene names on the left
          pct_side = "right",           # percentages on the right
          heatmap_legend_param = heatmap_legend_param)
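The percentage shown beside each gene reflects how many samples carry at least one alteration in that gene. As a sanity check, similar numbers can be computed by hand. The sketch below uses a made-up toy matrix (hypothetical gene and sample names) rather than the TCGA data. Whether samples removed by `remove_empty_columns` still count toward the denominator is not shown in this post, so treat the hand-computed values as an approximation.

```r
# Toy matrix; names and values are hypothetical
toy_mat <- matrix(
    c("mutation", "",      "mutation;indel", "",
      "",         "indel", "",               "",
      "",         "",      "",               ""),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("GeneA", "GeneB", "GeneC"),
                    c("S1", "S2", "S3", "S4"))
)

# Percentage of samples with any alteration, per gene
pct <- round(rowMeans(toy_mat != "") * 100)
sort(pct, decreasing = TRUE)

# Genes and samples without any alteration,
# i.e. the rows and columns that are empty
empty_genes <- rownames(toy_mat)[rowSums(toy_mat != "") == 0]
empty_samples <- colnames(toy_mat)[colSums(toy_mat != "") == 0]
```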
3 Add annotations
3.1 Add clinical annotation
pdata <- cli
head(pdata)
# Keep only the patients that appear in the mutation matrix
pdata <- subset(pdata, pdata$sampleID %in% colnames(mat))
# Put the matrix columns in the same order as the clinical table
mat <- mat[, pdata$sampleID]
# Define the annotation
ha <- HeatmapAnnotation(Age = pdata$age,
                        Gender = pdata$gender,
                        GeneExp_Subtype = pdata$GeneExp_Subtype,
                        censor = pdata$censor,
                        os = pdata$os,
                        show_annotation_name = TRUE,
                        annotation_name_gp = gpar(fontsize = 7))
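Because `HeatmapAnnotation` assigns annotation values to columns purely by position, the clinical table and the matrix must list the same samples in the same order. The sketch below shows one way to align and verify them on toy data. All names and values are made up, and `intersect` is used here so that samples missing from either table are dropped.

```r
# Toy inputs; sample IDs and ages are hypothetical
toy_mat <- matrix("", nrow = 2, ncol = 3,
                  dimnames = list(c("GeneA", "GeneB"),
                                  c("S1", "S2", "S3")))
toy_cli <- data.frame(sampleID = c("S3", "S9", "S1"),
                      age = c(61, 47, 55),
                      stringsAsFactors = FALSE)

# Samples present in both tables, in the clinical table's order
shared <- intersect(toy_cli$sampleID, colnames(toy_mat))
toy_cli <- toy_cli[match(shared, toy_cli$sampleID), ]
toy_mat <- toy_mat[, shared, drop = FALSE]

# Stop early if the two objects are not aligned
stopifnot(identical(colnames(toy_mat), toy_cli$sampleID))
```

Running the same `stopifnot()` check on `mat` and `pdata` before building the annotation helps catch silently misaligned samples.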
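3.2 Waterfall plot + clinical annotation
oncoPrint(mat,
          bottom_annotation = ha,       # annotation at the bottom
          alter_fun = alter_fun,
          col = col,
          remove_empty_columns = TRUE,  # drop samples without any alteration
          remove_empty_rows = TRUE,     # drop genes without any alteration
          column_title = column_title,
          heatmap_legend_param = heatmap_legend_param)
The annotation here uses the default colors, which are sometimes hard to tell apart and also “change” from one run to the next.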
3.3 Customize the annotation colors and the sample order
# Custom sample order: by survival status, then by expression subtype
s <- pdata[order(pdata$censor, pdata$GeneExp_Subtype), ]
sample_order <- as.character(s$sampleID)
# Custom colors
# Color mapping for the continuous variable, defined outside the annotation call
col_os <- colorRamp2(c(0, 4000), c("white", "red"))
ha <- HeatmapAnnotation(Age = pdata$age,
                        Gender = pdata$gender,
                        GeneExp_Subtype = pdata$GeneExp_Subtype,
                        censor = pdata$censor,
                        os = pdata$os,
                        # Specify the colors
                        col = list(censor = c("death" = "red", "alive" = "blue"),
                                   GeneExp_Subtype = c("Classical" = "orange",
                                                       "Mesenchymal" = "green",
                                                       "Neural" = "skyblue"),
                                   os = col_os),
                        show_annotation_name = TRUE,
                        annotation_name_gp = gpar(fontsize = 7))

Draw the waterfall plot

oncoplot_anno <- oncoPrint(mat,
                           bottom_annotation = ha,
                           alter_fun = alter_fun,
                           col = col,
                           column_order = sample_order,
                           remove_empty_columns = TRUE,  # drop samples without any alteration
                           remove_empty_rows = TRUE,     # drop genes without any alteration
                           column_title = column_title,
                           heatmap_legend_param = heatmap_legend_param)
oncoplot_anno
Note: these colors are not necessarily attractive; the point is that you can customize them when the default colors are too similar or when there are specific requirements.
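The `order()` call sorts samples by survival status first and by expression subtype within each status, so samples with the same annotation values end up side by side. On a toy clinical table (values made up for illustration) the effect looks like this:

```r
# Toy clinical table; sample IDs and values are hypothetical
toy_cli <- data.frame(
    sampleID = c("S1", "S2", "S3", "S4"),
    censor = c("death", "alive", "death", "alive"),
    GeneExp_Subtype = c("Neural", "Mesenchymal", "Classical", "Classical"),
    stringsAsFactors = FALSE
)

# Sort by status, then by subtype within each status
s <- toy_cli[order(toy_cli$censor, toy_cli$GeneExp_Subtype), ]
sample_order <- as.character(s$sampleID)
sample_order
# "S4" "S2" "S3" "S1"
```

The "alive" samples come first, and within each status the subtypes appear in alphabetical order.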
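3.4 Adjust the position of the annotation legend
draw(oncoplot_anno, annotation_legend_side = "bottom")
Moving the annotation legend makes it easier to assemble multi-panel figures later.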