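Introduction
Today I'd like to share a gem of an R package: GOplot.
As the name suggests, it is a toolkit dedicated to visualizing GO enrichment results. Tutorials on its functions are easy to find online, but here I have listed the commonly used adjustable parameters: colors, sizes, ordering and other customizations, all neatly laid out.
In fewer than a hundred lines of code, your figures go up a level: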
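Data preparation
Install the package and load the data. Pay attention to the data format here: it also incorporates the results of the differential expression analysis.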
# install.packages('GOplot')
library("GOplot")
data(EC)
head(EC$david)
head(EC$genelist)
circ <- circle_dat(EC$david, EC$genelist)
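To run the same workflow on your own results rather than the bundled EC data, the two inputs need the same column layout as EC$david and EC$genelist. The sketch below builds toy inputs in that layout. The gene symbols and numbers are invented for illustration, and the column names are taken from the example data shipped with the package, so check head(EC$david) and head(EC$genelist) on your own installation.

```r
# Toy enrichment table: one row per term, genes as a comma-separated string.
# All values below are made up for illustration.
my_terms <- data.frame(
  Category = c("BP", "CC"),
  ID       = c("GO:0007507", "GO:0005886"),
  Term     = c("heart development", "plasma membrane"),
  Genes    = c("GENEA, GENEB, GENEC", "GENEB, GENED"),
  adj_pval = c(0.001, 0.02),
  stringsAsFactors = FALSE
)

# Toy differential-expression table: needs at least ID and logFC.
my_genes <- data.frame(
  ID    = c("GENEA", "GENEB", "GENEC", "GENED"),
  logFC = c(1.8, -2.1, 0.9, 1.2),
  stringsAsFactors = FALSE
)

# Only call into GOplot if it is installed.
if (requireNamespace("GOplot", quietly = TRUE)) {
  my_circ <- GOplot::circle_dat(my_terms, my_genes)
  print(head(my_circ))
}
```

Every gene named in the Genes column should also appear in the ID column of the gene table, otherwise its logFC cannot be looked up.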
GOBar
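order.by.zscore controls how the bars are ordered. According to the package documentation, T orders them by z-score, while F (used below) orders them by the negative logarithm of the adjusted p-value.
GOBar(circ, # or subset(circ, category == 'BP') to plot a single category
      display = 'multiple', # one panel per category; 'single' for one panel
      order.by.zscore = F, # F: order by -log10(adj. p-value); T: order by z-score
      title = 'Z-score coloured barplot',
      zsc.col = c('yellow', 'black', 'cyan')) # z-score color scale: c(high, midpoint, low)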
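The z-score used for coloring is, per the GOplot vignette, not a standard statistical z-score but (up - down) / sqrt(count), where up and down are the numbers of up- and down-regulated genes in a term and count is the term's gene count. A worked base-R sketch with invented logFC values (go_zscore is a hypothetical helper name, not a package function):

```r
# zscore = (up - down) / sqrt(count), following the definition in the GOplot vignette.
go_zscore <- function(logFC) {
  up   <- sum(logFC > 0)   # number of up-regulated genes
  down <- sum(logFC < 0)   # number of down-regulated genes
  (up - down) / sqrt(length(logFC))
}

# Toy term with 4 genes: 3 up-regulated, 1 down-regulated (made-up values).
toy_logFC <- c(1.5, 0.8, 2.2, -1.1)
z <- go_zscore(toy_logFC)
print(z)  # (3 - 1) / sqrt(4) = 1
```

A positive value therefore hints that a term is mostly up-regulated, a negative value that it is mostly down-regulated.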
GOBubble
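A filtering step is applied first to reduce overlap between redundant terms. To skip the filtering, pass circ directly instead.
# Filter out overlapping terms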
reduced_circ <- reduce_overlap(circ, overlap = 0.75)
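GOBubble(reduced_circ, # or circ to skip the filtering
         title = 'Bubble plot',
         colour = c("#4DBBD5FF","#EFD500FF","#F39B7FFF"), # one color per category
         table.legend = T, # show the table mapping IDs to terms
         bg.col = T, # color panel backgrounds by category (multiple display only)
         ID = T, # label bubbles with GO IDs rather than term names
         display = 'multiple', # one panel per category; 'single' for one panel
         labels = 3) # threshold on -log10(adj. p-value) for showing labels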
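The idea behind the filtering step is that terms sharing most of their genes carry nearly the same information, so only one of them needs to be drawn. The sketch below illustrates that idea in base R only. It is not GOplot's implementation of reduce_overlap, and both the overlap measure (shared genes divided by the size of the smaller gene set) and the greedy keep-first rule are assumptions made for illustration. The gene sets are invented, and overlap_frac and filter_overlap are hypothetical helper names.

```r
# Illustrative only: NOT GOplot's implementation of reduce_overlap().
# Assumption: overlap = shared genes / size of the smaller gene set.
overlap_frac <- function(a, b) {
  length(intersect(a, b)) / min(length(a), length(b))
}

# Toy gene sets (invented).
gene_sets <- list(
  termA = c("G1", "G2", "G3", "G4"),
  termB = c("G1", "G2", "G3", "G5"),  # shares 3 of 4 genes with termA
  termC = c("G6", "G7")
)

# Greedy filter: keep a term only if it overlaps every kept term by < threshold.
filter_overlap <- function(sets, threshold = 0.75) {
  kept <- character(0)
  for (nm in names(sets)) {
    too_similar <- vapply(
      kept,
      function(k) overlap_frac(sets[[nm]], sets[[k]]) >= threshold,
      logical(1)
    )
    if (!any(too_similar)) kept <- c(kept, nm)
  }
  kept
}

kept_terms <- filter_overlap(gene_sets, threshold = 0.75)
print(kept_terms)  # "termA" "termC"
```

Lowering the threshold removes more terms; raising it toward 1 keeps almost everything.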
GOCircle
IDs <- c('GO:0007507', 'GO:0001568', 'GO:0001944', 'GO:0048729', 'GO:0048514', 'GO:0005886', 'GO:0008092', 'GO:0008047')
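GOCircle(circ, nsub = IDs, # terms to display: a vector of IDs or a number of terms
         rad1 = 2, # inner circle radius
         rad2 = 3, # outer circle radius
         table.legend = T, # show the table mapping IDs to terms
         zsc.col = c('darkgoldenrod1', 'black', 'cyan1'),
         lfc.col = c("#F39B7FFF","#4DBBD5FF"),
         label.size = 5,
         label.fontface = "bold")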
GOChord
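Here the data first have to be converted with chord_dat into a binary membership matrix: one row per selected gene, one column per selected term (1 if the gene is annotated to the term, 0 otherwise), plus a final logFC column.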
chord <- chord_dat(data = circ, genes = EC$genes, process = EC$process)
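GOChord(chord,
        title = "GOChord Plot",
        space = 0.02, # gap between chord ends
        gene.order = 'logFC', # order genes by logFC
        gene.space = 0.25, # gap between gene labels and the logFC segment
        gene.size = 3, # gene label size
        nlfc = 1, # number of logFC columns in chord
        lfc.col = c('darkgoldenrod1', 'black', 'cyan1'),
        lfc.min = -4, # lower limit of the logFC color scale
        lfc.max = 4, # upper limit of the logFC color scale
        border.size = 0.5, # ribbon border thickness
        process.label = 8) # legend text size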
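To make the shape of the chord_dat output concrete, here is a base-R sketch that builds the same kind of gene-by-term 0/1 matrix from a toy table with one row per gene-term pair. It is not the package code, and the genes, term assignments and logFC values are invented.

```r
# Toy table: one row per gene-term pair (made-up values).
long <- data.frame(
  genes = c("GENEA", "GENEB", "GENEA", "GENEC", "GENEB"),
  term  = c("heart development", "heart development",
            "plasma membrane", "plasma membrane", "tissue morphogenesis"),
  logFC = c(1.8, -2.1, 1.8, 0.9, -2.1),
  stringsAsFactors = FALSE
)

# Cross-tabulate genes against terms, then cap counts at 1 to get 0/1 membership.
membership <- unclass(table(long$genes, long$term))
membership[membership > 1] <- 1

# Append each gene's logFC as the last column, matching the row order.
gene_lfc  <- long$logFC[match(rownames(membership), long$genes)]
toy_chord <- cbind(membership, logFC = gene_lfc)
print(toy_chord)
```

Each row is a gene, each term column marks membership, and the trailing logFC column is what the chord plot uses to color the gene segments.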
GOHeat
GOHeat(chord[,-8], # drop the logFC column (column 8)
       nlfc = 0)
GOHeat(chord,
       nlfc = 1, # one logFC column, taken from the last column
       fill.col = c("#4DBBD5FF","#EFD500FF","#F39B7FFF"))
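The first call drops the logFC column, so tiles fall back to the default coloring: the number of enriched terms each gene belongs to.
The second call keeps the logFC column and colors tiles by the fold change stored in the last column.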
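The count used for coloring in the nlfc = 0 case is just a row sum over the term columns of the membership matrix. A toy base-R sketch (the matrix values are invented):

```r
# Toy 0/1 membership matrix with a trailing logFC column (made-up values).
toy <- cbind(
  termA = c(1, 1, 0),
  termB = c(1, 0, 0),
  termC = c(1, 1, 1),
  logFC = c(2.0, -1.5, 0.7)
)
rownames(toy) <- c("GENEA", "GENEB", "GENEC")

# Drop the last (logFC) column by position, then count terms per gene.
term_cols  <- toy[, -ncol(toy), drop = FALSE]
term_count <- rowSums(term_cols)
print(term_count)  # GENEA 3, GENEB 2, GENEC 1
```

Indexing with -ncol(toy) avoids hard-coding the column number, which changes whenever a different number of terms is selected.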
GOCluster
Here genes are clustered by logFC; set clust.by = 'term' to cluster by term membership instead.
GOCluster(circ,
EC$process,
metric = "euclidean", # Euclidean distance
clust.by = 'logFC', # or 'term'
nlfc = F, # T if the data hold several logFC columns, F for a single one
lfc.col = c('darkgoldenrod1', 'black', 'cyan1'),
lfc.min = -3,
lfc.max = 3,
lfc.space = 0,
lfc.width = 0.5,
term.col = ggsci::pal_frontiers(alpha = 0.7)(7), # max = 20
term.space = 0.5,
term.width = 2)
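Clustering genes by logFC rests on ordinary hierarchical clustering. The base-R sketch below shows the idea with Euclidean distance. It is not GOCluster's own code, the average-linkage method is an assumption chosen for illustration, and the logFC values are invented.

```r
# Toy logFC values for six genes (made-up).
lfc <- c(GENEA = 2.1, GENEB = 1.9, GENEC = -1.8,
         GENED = -2.2, GENEE = 0.1, GENEF = 2.4)

# Euclidean distance between genes on their logFC, then hierarchical clustering.
# (Assumption: average linkage; any hclust method could be swapped in.)
d  <- dist(lfc, method = "euclidean")
hc <- hclust(d, method = "average")

# Cut the tree into 3 groups and inspect which genes fall together.
groups <- cutree(hc, k = 3)
print(split(names(groups), groups))
```

With these values the strongly up-regulated genes, the strongly down-regulated genes and the near-zero gene end up in three separate groups.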
GOVenn
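Red is the number of up-regulated genes, purple the number of down-regulated genes, and gray the number of genes in an intersection whose up/down direction conflicts between lists.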
l1 <- subset(circ, term == 'heart development', c(genes,logFC))
l2 <- subset(circ, term == 'plasma membrane', c(genes,logFC))
l3 <- subset(circ, term == 'tissue morphogenesis', c(genes,logFC))
GOVenn(l1,l2,l3,
label = c('heart development', 'plasma membrane', 'tissue morphogenesis'),
lfc.col = c("#EE0000FF","#808180FF","#5F559BFF"),
circle.col = c("#EFD500FF","#4DBBD5FF","#F39B7FFF"),
plot = T)
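The three color classes can be reproduced by hand for two lists: merge on gene, then compare the signs of the two logFC values. The base-R sketch below is not GOVenn's implementation. The genes and values are invented, and the two toy lists deliberately carry different logFC values for the same gene so that the conflicting class appears.

```r
# Two toy gene lists in the same genes/logFC layout as l1 and l2 (made-up values).
a <- data.frame(genes = c("GENEA", "GENEB", "GENEC", "GENED"),
                logFC = c(1.2, -0.8, 2.0, -1.5), stringsAsFactors = FALSE)
b <- data.frame(genes = c("GENEB", "GENEC", "GENED", "GENEE"),
                logFC = c(-1.1, -0.4, -2.0, 0.9), stringsAsFactors = FALSE)

# Keep only genes present in both lists.
both <- merge(a, b, by = "genes", suffixes = c(".a", ".b"))

# Classify each shared gene by whether the two logFC signs agree.
both$direction <- ifelse(both$logFC.a > 0 & both$logFC.b > 0, "up",
                  ifelse(both$logFC.a < 0 & both$logFC.b < 0, "down", "contra"))
counts <- table(factor(both$direction, levels = c("up", "contra", "down")))
print(counts)  # up 0, contra 1, down 2
```

Here GENEB and GENED are down in both lists, while GENEC is up in one and down in the other.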
shiny
The package authors also built an interactive shiny web app for the Venn diagram; give it a try if you are interested: https://wwalter.shinyapps.io/Venn/
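Wrap-up
Say goodbye to hand-drawing everything in ggplot2: the usual red-and-blue bubble plots and bar charts for GO enrichment have grown tiresome. GOplot is a lifesaver for the terminally lazy, since everything comes as a packaged function and you only need to swap in your own data and run it. Remember to cite the paper: Walter, Wencke, Fátima Sánchez-Cabo, and Mercedes Ricote. "GOplot: an R package for visually combining expression data with functional analysis." Bioinformatics (2015): btv300.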