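Preface
Today I want to share a gem of an R package: GOplot.
As the name suggests, it is a toolkit dedicated to visualizing GO enrichment results. There are plenty of tutorials for its functions online, but here I have listed the commonly used adjustable parameters for each one, so colours, sizes, and ordering can all be customized in one tidy place.
In under a hundred lines of code, your figures go up a level.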
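Data preparation
Install the package and load the example data. Pay attention to the data format here: the input also includes the results of the differential expression analysis, not just the enrichment table.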
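# install.packages('GOplot')
library("GOplot")
data(EC)
head(EC$david)     # enrichment results
head(EC$genelist)  # differential expression results
# Combine both tables into the plotting object used below
circ <- circle_dat(EC$david, EC$genelist)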
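To make the expected input shape concrete, here is a minimal sketch of two toy data frames laid out like EC$david and EC$genelist. The gene names and values are made up for illustration, and the column names reflect my reading of the example data, so check them against the head() output above before relying on them.

```r
# Toy enrichment table, laid out like EC$david (hypothetical values).
# Genes are stored as one comma-separated string per term.
toy_terms <- data.frame(
  Category = c("BP", "BP"),
  ID       = c("GO:0007507", "GO:0001568"),
  Term     = c("heart development", "blood vessel development"),
  Genes    = c("GENEA, GENEB, GENEC", "GENEB, GENED"),
  adj_pval = c(0.001, 0.02),
  stringsAsFactors = FALSE
)

# Toy differential-expression table, laid out like EC$genelist.
toy_genes <- data.frame(
  ID    = c("GENEA", "GENEB", "GENEC", "GENED"),
  logFC = c(1.8, -2.1, 0.9, 1.2),
  stringsAsFactors = FALSE
)

# Split the gene strings to see which genes belong to which term.
genes_per_term <- strsplit(toy_terms$Genes, ", ")
names(genes_per_term) <- toy_terms$ID
lengths(genes_per_term)
```

Two data frames of this shape are what circle_dat() takes as its first and second arguments; every gene named in the enrichment table should also appear in the ID column of the gene list.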
GOBar
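The example sets order.by.zscore to F, which orders the bars by significance (the negative log of the adjusted p-value). Setting it to T, the default, orders the bars by z-score instead.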
GOBar(circ,                     # or subset(circ, category == 'BP')
      display = 'multiple',     # or 'single'
      order.by.zscore = F,      # bar ordering, see above
      title = 'Z-score coloured barplot',
      zsc.col = c('yellow', 'black', 'cyan'))
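For context on what the z-score colouring means: the GOplot documentation defines the z-score of a term as (up - down) / sqrt(count), where up and down are the numbers of genes assigned to the term with positive and negative logFC. A minimal base-R sketch with made-up logFC values:

```r
# Hypothetical logFC values for the genes assigned to one GO term.
term_logfc <- c(1.8, -2.1, 0.9, 1.2)

up    <- sum(term_logfc > 0)   # genes going up
down  <- sum(term_logfc < 0)   # genes going down
count <- length(term_logfc)    # all genes assigned to the term

zscore <- (up - down) / sqrt(count)
zscore
```

A positive value means the term is dominated by up-regulated genes, a negative value by down-regulated ones.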
GOBubble
A filtering step is applied first to reduce overlap between terms. To skip the filtering, pass circ directly instead.
# Filter out overlapping terms
reduced_circ <- reduce_overlap(circ, overlap = 0.75)
GOBubble(reduced_circ,          # or circ
         title = 'Bubble plot',
         colour = c("#4DBBD5FF", "#EFD500FF", "#F39B7FFF"),
         table.legend = T,
         bg.col = T,
         ID = T,                # label bubbles with the GO ID
         display = 'multiple',
         labels = 3)
GOCircle
IDs <- c('GO:0007507', 'GO:0001568', 'GO:0001944', 'GO:0048729',
         'GO:0048514', 'GO:0005886', 'GO:0008092', 'GO:0008047')
GOCircle(circ,
         nsub = IDs,
         rad1 = 2,
         rad2 = 3,
         table.legend = T,
         zsc.col = c('darkgoldenrod1', 'black', 'cyan1'),
         lfc.col = c("#F39B7FFF", "#4DBBD5FF"),
         label.size = 5,
         label.fontface = "bold")
GOChord
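Here the data first has to be converted with chord_dat into a gene-by-term membership matrix (one row per gene, one 0/1 column per term, plus a logFC column).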
chord <- chord_dat(data = circ, genes = EC$genes, process = EC$process)
GOChord(chord,
        title = "GOChord Plot",
        space = 0.02,
        gene.order = 'logFC',
        gene.space = 0.25,
        gene.size = 3,
        nlfc = 1,
        lfc.col = c('darkgoldenrod1', 'black', 'cyan1'),
        lfc.min = -4,
        lfc.max = 4,
        border.size = 0.5,
        process.label = 8)
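To show what a gene-by-term membership matrix looks like, here is a base-R sketch that builds one by hand from made-up gene/term pairs. It is an illustration of the data shape only, not the package's implementation of chord_dat, and all names and values are hypothetical.

```r
# Hypothetical gene/term pairs plus one logFC per gene.
pairs <- data.frame(
  gene = c("GENEA", "GENEB", "GENEB", "GENEC", "GENED"),
  term = c("heart development", "heart development",
           "plasma membrane", "plasma membrane", "plasma membrane"),
  stringsAsFactors = FALSE
)
logfc <- c(GENEA = 1.8, GENEB = -2.1, GENEC = 0.9, GENED = 1.2)

# Cross-tabulate to a 0/1 matrix: rows are genes, columns are terms.
membership <- unclass(table(pairs$gene, pairs$term))
membership[membership > 0] <- 1

# Append logFC as the last column, matched by gene name.
toy_chord <- cbind(membership, logFC = logfc[rownames(membership)])
toy_chord
```

Each ribbon in a chord plot corresponds to a 1 in such a matrix, so a gene with several 1s in its row is linked to several terms.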
GOHeat
GOHeat(chord[,-8],              # drop the FC column
       nlfc = 0)
GOHeat(chord,
       nlfc = 1,                # FC is in the last column
       fill.col = c("#4DBBD5FF", "#EFD500FF", "#F39B7FFF"))
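The first call drops the FC column, so the colour defaults to the number of enriched terms each gene is assigned to.
The second call keeps the FC column and colours the cells by the fold change stored in the last column.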
GOCluster
Here the genes are clustered by FC; set clust.by to 'term' to cluster by term instead.
GOCluster(circ, EC$process,
          metric = "euclidean",   # Euclidean distance
          clust.by = 'logFC',     # or 'term'
          nlfc = F,               # T for multiple FC columns, F for a single one
          lfc.col = c('darkgoldenrod1', 'black', 'cyan1'),
          lfc.min = -3,
          lfc.max = 3,
          lfc.space = 0,
          lfc.width = 0.5,
          term.col = ggsci::pal_frontiers(alpha = 0.7)(7), # needs the ggsci package; max = 20
          term.space = 0.5,
          term.width = 2)
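As a rough picture of what clustering genes by logFC with a Euclidean metric means, here is a base-R sketch on made-up values. It uses stats::dist and stats::hclust with average linkage; the linkage method is my assumption for illustration, and the sketch is not a statement of what GOCluster does internally.

```r
# Hypothetical logFC values for six genes.
logfc <- c(GENEA = 2.4, GENEB = 2.1, GENEC = 1.9,
           GENED = -1.8, GENEE = -2.2, GENEF = -2.5)

# Euclidean distances between genes, then average-linkage clustering.
d  <- dist(logfc, method = "euclidean")
hc <- hclust(d, method = "average")

# Cut the tree into two groups to see which genes cluster together.
groups <- cutree(hc, k = 2)
split(names(groups), groups)
```

With these values the up-regulated and down-regulated genes fall into separate groups, which is the kind of structure the dendrogram in the middle of the plot is meant to reveal.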
GOVenn
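Red is the number of up-regulated genes, purple the number of down-regulated genes, and grey the number of genes in an intersection whose up/down direction conflicts between the lists.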
l1 <- subset(circ, term == 'heart development', c(genes, logFC))
l2 <- subset(circ, term == 'plasma membrane', c(genes, logFC))
l3 <- subset(circ, term == 'tissue morphogenesis', c(genes, logFC))
GOVenn(l1, l2, l3,
       label = c('heart development', 'plasma membrane', 'tissue morphogenesis'),
       lfc.col = c("#EE0000FF", "#808180FF", "#5F559BFF"),
       circle.col = c("#EFD500FF", "#4DBBD5FF", "#F39B7FFF"),
       plot = T)
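The three counts shown in each intersection can be sketched in base R as follows. The two lists below are hypothetical and deliberately carry different logFC values for the same gene, as they would if they came from two different comparisons; this is an illustration of the counting idea, not the package's code.

```r
# Two hypothetical gene lists with logFC, shaped like l1 and l2.
a <- data.frame(genes = c("GENEA", "GENEB", "GENEC"),
                logFC = c(1.8, -2.1, 0.9))
b <- data.frame(genes = c("GENEB", "GENEC", "GENED"),
                logFC = c(-1.5, -0.4, 1.2))

# Genes present in both lists, with the logFC from each side.
both <- merge(a, b, by = "genes", suffixes = c(".a", ".b"))

up     <- sum(both$logFC.a > 0 & both$logFC.b > 0)       # up in both
down   <- sum(both$logFC.a < 0 & both$logFC.b < 0)       # down in both
contra <- sum(sign(both$logFC.a) != sign(both$logFC.b))  # conflicting direction

c(up = up, down = down, contra = contra)
```

When all lists are subsets of the same circ object, as in the call above, each gene has a single logFC, so the conflicting count would be expected to stay at zero.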
shiny
The authors have also built an interactive shiny web app for the Venn diagram. If you are interested, give it a try: https://wwalter.shinyapps.io/Venn/
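Summary
Say goodbye to drawing everything by hand in ggplot2; the usual red-and-blue bubble plots and bar plots for GO enrichment have grown tiresome. GOplot is a lifesaver for the terminally lazy: the functions are all pre-packaged, so you only need to swap in your own data and run them. Remember to cite the package: Walter, Wencke, Fátima Sánchez-Cabo, and Mercedes Ricote. "GOplot: an R package for visually combining expression data with functional analysis." Bioinformatics (2015): btv300.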