funkyheatmap | A stylish heatmap for visualizing clinical, omics, and group data together


This article was first published on the WeChat public account “生信补给站”: https://mp.weixin.qq.com/s/04FA3O4QVwg9CNtHBWbUYw


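Clinical data are usually summarized in a "Table 1" of baseline characteristics; see, for example, the earlier posts on tableone (quickly producing a three-line baseline-characteristics "Table 1" in R) and gtsummary (a handy tool for drawing many kinds of data summary tables).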

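This post introduces another way to present such data: the funkyheatmap R package. It provides functions that generate heatmap-like visualizations for benchmark data, and the result can be fine-tuned with column and row annotations. The target result is the cover figure of the original post.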


I. Load the R package and data


First install the funkyheatmap package.

1) Start with a plot of the mtcars data


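# Kept from the original post, commented out: installs the dynbenchmark package from GitHub
#devtools::install_github("dynverse/dynbenchmark/package")
# Install funkyheatmap from CRAN (only needed once)
install.packages("funkyheatmap")
library(funkyheatmap)
library(dplyr, warn.conflicts = FALSE)
library(tibble, warn.conflicts = FALSE)
library(tidyverse)  # also provides stringr, used below
# RColorBrewer must be installed as well; it is called via RColorBrewer:: below
data("mtcars")
# Draw every column of mtcars with the default settings
funky_heatmap(mtcars)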

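Any data frame shaped like mtcars can be plotted this way. By default, every column is drawn; the rest of this post adjusts the arguments to approach the cover figure.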
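As a hedged illustration of what "shaped like mtcars" can mean in practice, the sketch below moves the row names into an explicit `id` column and sorts the rows before plotting. The `id` column name follows the convention of the dynbenchmark example later in this post; sorting by `mpg` and the object names `cars` and `g_cars` are arbitrary choices for illustration. The plotting call is guarded so that the data preparation still runs where funkyheatmap is not installed.

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble, warn.conflicts = FALSE)

data("mtcars")

# Move the car names from the row names into an explicit `id` column,
# then sort so that the cars with the highest mpg come first.
cars <- mtcars %>%
  rownames_to_column("id") %>%
  arrange(desc(mpg))

# Inspect the first few rows of the prepared table.
head(cars[, c("id", "mpg", "cyl")], 3)

# Draw with default settings; skipped when funkyheatmap is not installed.
if (requireNamespace("funkyheatmap", quietly = TRUE)) {
  g_cars <- funkyheatmap::funky_heatmap(cars)
}
```

Keeping the identifier as a regular column makes it easier to sort, filter, or join the table before it is handed to funky_heatmap().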

The detailed walkthrough below uses the dynbenchmark_data dataset.

2) Load dynbenchmark_data (from the 2019 Nature Biotechnology paper "A comparison of single-cell trajectory inference methods")

data("dynbenchmark_data")
data <- dynbenchmark_data
head(data)

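II. Draw the funky heatmap


Reproducing the cover figure requires a series of settings.

1. Set row_info and row_groups

row_info selects the rows to display (here, the methods listed in the id column). All rows are shown here; use filter() to keep only the rows you want.

row_groups defines the row groups (here, the type of each method); these are the row-group labels marked with a red box in the original figure.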

row_info <-
  data %>%
  select(group, id)
row_groups <-
  data %>%
  transmute(
    group,
    Group = case_when(
      group == "cycle" ~ "Cyclic methods",
      TRUE ~ paste0(stringr::str_to_title(group), " methods")
    )
  ) %>%
  unique()
head(row_info)
# A tibble: 6 × 2
#  group id                 
#  <fct> <chr>              
#1 graph paga               
#2 graph raceid_stemid      
#3 graph slicer             
#4 tree  slingshot          
#5 tree  paga_tree          
#6 tree  projected_slingshot
head(row_groups)
# A tibble: 6 × 2
#  group          Group                 
#  <fct>          <chr>                 
#1 graph          Graph methods         
#2 tree           Tree methods          
#3 multifurcation Multifurcation methods
#4 bifurcation    Bifurcation methods   
#5 linear         Linear methods        
#6 cycle          Cyclic methods 
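The filter() step mentioned above can be sketched as follows. This is a minimal illustration on a toy table copied from the head(row_info) output shown above, not on the full dynbenchmark_data; the names `row_info_demo`, `row_info_tree`, and `row_groups_tree` are hypothetical, and `group` is a plain character column here rather than a factor.

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble, warn.conflicts = FALSE)

# Toy stand-in for row_info, copied from the head(row_info) output above.
row_info_demo <- tribble(
  ~group,  ~id,
  "graph", "paga",
  "graph", "raceid_stemid",
  "graph", "slicer",
  "tree",  "slingshot",
  "tree",  "paga_tree",
  "tree",  "projected_slingshot"
)

# Keep only the tree methods.
row_info_tree <- row_info_demo %>%
  filter(group == "tree")

# Rebuild the matching row_groups table so it lists only the groups
# that are still present, with a capitalised display label.
row_groups_tree <- row_info_tree %>%
  distinct(group) %>%
  mutate(Group = paste0(toupper(substr(group, 1, 1)), substring(group, 2), " methods"))

row_info_tree
row_groups_tree
```

Filtering row_info and deriving row_groups from the filtered table keeps the two in sync, so no group label is left without rows.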

2. Set column_info

Define which columns to display and the attributes of each.


column_info <- tribble( # tribble_start
  ~group,                   ~id,                                            ~name,                      ~geom,        ~palette,         ~options,
  "method_characteristic",  "method_name",                                  "",                         "text",       NA,               list(hjust = 0, width = 6),
  "method_characteristic",  "method_platform",                              "Platform",                 "text",       NA,               list(width = 2),
  "method_characteristic",  "method_topology_inference",                    "Topology inference",       "text",       NA,               list(width = 2),
  "score_overall",          "summary_overall_overall",                      "Overall",                  "bar",        "overall",        list(width = 4, legend = FALSE),
  "score_overall",          "benchmark_overall_overall",                    "Accuracy",                 "bar",        "benchmark",      list(width = 4, legend = FALSE),
  "score_overall",          "qc_overall_overall",                           "Usability",                "bar",        "qc",             list(width = 4, legend = FALSE),
  "score_overall",          "control_label",                                "",                         "text",       NA,               list(overlay = TRUE),
  "benchmark_metric",       "benchmark_overall_norm_him",                   "Topology",                 "funkyrect",  "benchmark",      lst(),
  "benchmark_metric",       "benchmark_overall_norm_F1_branches",           "Branch assignment",        "funkyrect",  "benchmark",      lst(),
  "benchmark_metric",       "benchmark_overall_norm_correlation",           "Cell positions",           "funkyrect",  "benchmark",      lst(),
  "benchmark_metric",       "benchmark_overall_norm_featureimp_wcor",       "Features",                 "funkyrect",  "benchmark",      lst(),
  "benchmark_source",       "benchmark_source_real_gold",                   "Gold",                     "funkyrect",  "benchmark",      lst(),
  "benchmark_source",       "benchmark_source_real_silver",                 "Silver",                   "funkyrect",  "benchmark",      lst(),
  "benchmark_source",       "benchmark_source_synthetic_dyngen",            "dyngen",                   "funkyrect",  "benchmark",      lst(),
  "benchmark_source",       "benchmark_source_synthetic_dyntoy",            "dyntoy",                   "funkyrect",  "benchmark",      lst(),
  "benchmark_source",       "benchmark_source_synthetic_prosstt",           "PROSSTT",                  "funkyrect",  "benchmark",      lst(),
  "benchmark_source",       "benchmark_source_synthetic_splatter",          "Splatter",                 "funkyrect",  "benchmark",      lst(),
  "benchmark_execution",    "benchmark_overall_pct_errored_str",            "% Errored",                "text",       NA,               lst(hjust = 1),
  "benchmark_execution",    "benchmark_overall_error_reasons",              "Reason",                   "pie",        "error_reasons",  lst(),
  "scaling_predtime",       "scaling_pred_scoretime_cells1m_features100",   "1m \u00D7 100",            "rect",       "scaling",        lst(scale = FALSE),
  "scaling_predtime",       "scaling_pred_scoretime_cells1m_features100",   "",                         "text",       "white6black4",   lst(label = "scaling_pred_timestr_cells1m_features100", overlay = TRUE, size = 3, scale = FALSE),
  "scaling_predtime",       "scaling_pred_scoretime_cells100k_features1k",  "100k \u00D7 1k",           "rect",       "scaling",        lst(scale = FALSE),
  "scaling_predtime",       "scaling_pred_scoretime_cells100k_features1k",  "",                         "text",       "white6black4",   lst(label = "scaling_pred_timestr_cells100k_features1k", overlay = TRUE, size = 3, scale = FALSE),
  "scaling_predtime",       "scaling_pred_scoretime_cells10k_features10k",  "10k \u00D7 10k",           "rect",       "scaling",        lst(scale = FALSE),
  "scaling_predtime",       "scaling_pred_scoretime_cells10k_features10k",  "",                         "text",       "white6black4",   lst(label = "scaling_pred_timestr_cells10k_features10k", overlay = TRUE, size = 3, scale = FALSE),
  "scaling_predtime",       "scaling_pred_scoretime_cells1k_features100k",  "1k \u00D7 100k",           "rect",       "scaling",        lst(scale = FALSE),
  "scaling_predtime",       "scaling_pred_scoretime_cells1k_features100k",  "",                         "text",       "white6black4",   lst(label = "scaling_pred_timestr_cells1k_features100k", overlay = TRUE, size = 3, scale = FALSE),
  "scaling_predtime",       "scaling_pred_scoretime_cells100_features1m",   "100 \u00D7 1m",            "rect",       "scaling",        lst(scale = FALSE),
  "scaling_predtime",       "scaling_pred_scoretime_cells100_features1m",   "",                         "text",       "white6black4",   lst(label = "scaling_pred_timestr_cells100_features1m", overlay = TRUE, size = 3, scale = FALSE),
  "scaling_predtime",       "benchmark_overall_time_predcor_str",           "Cor. pred. vs. real",      "text",       NA,               lst(size = 3),
  "stability",              "stability_him",                                "Topology",                 "funkyrect",  "stability",      lst(),
  "stability",              "stability_F1_branches",                        "Branch assignment",        "funkyrect",  "stability",      lst(),
  "stability",              "stability_correlation",                        "Cell positions",           "funkyrect",  "stability",      lst(),
  "stability",              "stability_featureimp_wcor",                    "Features",                 "funkyrect",  "stability",      lst(),
  "qc_category",            "qc_cat_availability",                          "Availability",             "funkyrect",  "qc",             lst(),
  "qc_category",            "qc_cat_behaviour",                             "Behaviour",                "funkyrect",  "qc",             lst(),
  "qc_category",            "qc_cat_code_assurance",                        "Code assurance",           "funkyrect",  "qc",             lst(),
  "qc_category",            "qc_cat_code_quality",                          "Code quality",             "funkyrect",  "qc",             lst(),
  "qc_category",            "qc_cat_documentation",                         "Documentation",            "funkyrect",  "qc",             lst(),
  "qc_category",            "qc_cat_paper",                                 "Paper",                    "funkyrect",  "qc",             lst(),
  "qc_category",            "control_label",                                "",                         "text",       NA,               list(overlay = TRUE, width = -6)
) 

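The main columns are:

group: the group the column belongs to;

id: the column name in data;

name: the label shown in the figure;

geom: the geometry used to draw the column (when one column needs several geoms, such as rect plus text, give it one row per geom, as for scaling_predtime);

palette: the name of the palette to use;

options: a list of extra per-column settings, such as the width, hjust, legend, and overlay values used above.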
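Because every id in column_info has to name a column of data, a quick check before plotting can catch typos in a long table like the one above. The sketch below is only an illustration on a hypothetical miniature data set; `data_demo`, `column_info_demo`, and `missing_ids` are made-up names, and the check itself uses plain base R rather than any funkyheatmap function.

```r
library(tibble)

# Hypothetical miniature data set, for illustration only.
data_demo <- tibble(
  id = c("method_a", "method_b"),
  score_overall = c(0.8, 0.5),
  runtime_str = c("2m", "10m")
)

# Hypothetical column_info with one deliberately misspelled id.
column_info_demo <- tribble(
  ~id,             ~name,     ~geom,
  "id",            "",        "text",
  "score_overall", "Overall", "bar",
  "runtime_str",   "Runtime", "text",
  "score_typo",    "Typo",    "funkyrect"
)

# ids listed in column_info that do not exist as columns in the data
missing_ids <- setdiff(column_info_demo$id, colnames(data_demo))
missing_ids
```

Running the same setdiff() on the real column_info and data should return an empty character vector when every row refers to an existing column.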

3. Set column_groups

For each group used in column_info, set its Category and the higher-level Experiment it belongs to.


column_groups <- tribble(
  ~Experiment,    ~Category,                                      ~group,                   ~palette,
  "Method",       "\n",                                           "method_characteristic",  "overall",
  "Summary",      "Aggregated scores per experiment",             "score_overall",          "overall",
  "Accuracy",     "Per metric",                                   "benchmark_metric",       "benchmark",
  "Accuracy",     "Per dataset source",                           "benchmark_source",       "benchmark",
  "Accuracy",     "Errors",                                       "benchmark_execution",    "benchmark",
  "Scalability",  "Predicted time\n(#cells \u00D7 #features)",    "scaling_predtime",       "scaling",
  "Stability",    "Similarity\nbetween runs",                     "stability",              "stability",
  "Usability",    "Quality of\nsoftware and paper",               "qc_category",            "qc"
) 

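Experiment: the Experiment label of the group (red box in the original figure)

Category: the Category label of the group (green box in the original figure)

group: the column group (the same values as group in column_info)

palette: the palette used for the group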
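Since column_info, column_groups, and the palette table refer to each other by name, it can help to confirm that the names line up. The following is a minimal sketch on hypothetical miniature tables (all object names and values ending in `_demo` are made up for illustration); it uses only base R set operations and makes no claim about how funkyheatmap itself validates its inputs.

```r
library(tibble)

# Hypothetical miniature versions of the three tables.
column_info_demo <- tribble(
  ~group,          ~id,             ~palette,
  "score_overall", "summary_score", "overall",
  "qc_category",   "qc_paper",      "qc",
  "stability",     "stability_him", "stability"
)
column_groups_demo <- tribble(
  ~Experiment, ~Category,    ~group,          ~palette,
  "Summary",   "Aggregated", "score_overall", "overall",
  "Usability", "Quality",    "qc_category",   "qc"
)
palettes_demo <- tribble(
  ~palette,  ~colours,
  "overall", c("grey90", "grey10"),
  "qc",      c("white", "darkgreen")
)

# Groups used by columns but never described in column_groups
groups_without_header <- setdiff(column_info_demo$group, column_groups_demo$group)

# Palette names referenced in either table but not defined in palettes
used_palettes <- unique(c(column_info_demo$palette, column_groups_demo$palette))
used_palettes <- used_palettes[!is.na(used_palettes)]
undefined_palettes <- setdiff(used_palettes, palettes_demo$palette)

groups_without_header
undefined_palettes
```

In this toy case both checks report "stability", which is the group and palette deliberately left out of the second and third tables.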

4. Set palettes

Define the colours of each palette.


error_reasons <- tibble(
  name = c("pct_memory_limit", "pct_time_limit", "pct_execution_error", "pct_method_error"),
  label = c("Memory limit exceeded", "Time limit exceeded", "Execution error", "Method error"),
  colour = RColorBrewer::brewer.pal(length(name), "Set3")
)
palettes <- tribble(
  ~palette,             ~colours,
  "overall",            grDevices::colorRampPalette(rev(RColorBrewer::brewer.pal(9, "Greys")[-1]))(101),
  "benchmark",          grDevices::colorRampPalette(rev(RColorBrewer::brewer.pal(9, "Blues") %>% c("#011636")))(101),
  "scaling",            grDevices::colorRampPalette(rev(RColorBrewer::brewer.pal(9, "Reds")[-8:-9]))(101),
  "stability",          grDevices::colorRampPalette(rev(RColorBrewer::brewer.pal(9, "YlOrBr")[-7:-9]))(101),
  "qc",                 grDevices::colorRampPalette(rev(RColorBrewer::brewer.pal(9, "Greens")[-1] %>% c("#00250f")))(101),
  "error_reasons",      error_reasons %>% select(label, colour) %>% deframe(),
  "white6black4",       c(rep("white", 3), rep("black", 7))
)
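The palette table above mixes two kinds of entries: long colour ramps for continuous scores and a named vector for categories. The sketch below shows both patterns in isolation, with arbitrary colours and hypothetical object names (`ramp`, `ramp_rev`, `reasons_demo`, `reason_colours`); it does not reproduce the exact palettes of the post.

```r
library(tibble)

# Interpolate a two-colour ramp into 101 steps, the same pattern as the
# continuous palettes above (the two colours here are arbitrary).
ramp <- grDevices::colorRampPalette(c("white", "black"))(101)
length(ramp)

# rev() flips the direction of the ramp, as done for the Brewer palettes.
ramp_rev <- rev(ramp)

# For a categorical palette, deframe() turns a two-column table into a
# named vector: names from the first column, values from the second.
reasons_demo <- tibble(
  label  = c("Memory limit exceeded", "Time limit exceeded"),
  colour = c("#8DD3C7", "#FFFFB3")  # arbitrary hex colours
)
reason_colours <- deframe(reasons_demo)
reason_colours
```

A named vector lets each category (here, each error reason) keep a fixed colour regardless of the order in which the categories appear.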


5. Draw the funky heatmap

With all of the settings above in place, the funky heatmap can be drawn.


g <- funky_heatmap(
  data = data,
  column_info = column_info,
  column_groups = column_groups,
  row_info = row_info,
  row_groups = row_groups,
  palettes = palettes,
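  # In newer funkyheatmap releases this option may need to be passed as
  # position_args = position_arguments(col_annot_offset = 3.2) instead.
  col_annot_offset = 3.2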
)
g
# Save the plot
#ggsave("path_to_plot.pdf", g, device = cairo_pdf, width = g$width, height = g$height)

With the same approach, the clinical, omics, and group information of each patient (one id per row) can all be shown in a single figure.
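To make the transfer to patient data concrete, here is a minimal sketch with an entirely hypothetical patient table. All values, column names (`arm`, `age`, `gene_score`, `mutation_burden`), and object names are invented for illustration. It assumes that funky_heatmap() accepts a column_info table containing only id, name, and geom and fills in the remaining settings with defaults; that assumption has not been checked against every package version, so the call is guarded and should be treated as a starting point.

```r
library(tibble)

# Entirely hypothetical patient table: one row per patient, with a study
# arm, one clinical variable, and two omics-style scores.
patients <- tribble(
  ~id,   ~arm,        ~age, ~gene_score, ~mutation_burden,
  "P01", "treatment",   54,        0.82,             0.10,
  "P02", "treatment",   61,        0.35,             0.55,
  "P03", "control",     47,        0.64,             0.20,
  "P04", "control",     70,        0.12,             0.90
)

# One row per displayed column: which column, its label, and how to draw it.
patient_columns <- tribble(
  ~id,               ~name,             ~geom,
  "id",              "Patient",         "text",
  "arm",             "Arm",             "text",
  "age",             "Age",             "bar",
  "gene_score",      "Gene score",      "funkyrect",
  "mutation_burden", "Mutation burden", "funkyrect"
)

# Every displayed column must exist in the data.
stopifnot(all(patient_columns$id %in% colnames(patients)))

# Draw the figure; skipped when funkyheatmap is not installed.
if (requireNamespace("funkyheatmap", quietly = TRUE)) {
  g_patients <- funkyheatmap::funky_heatmap(patients, column_info = patient_columns)
}
```

From here, row_info and row_groups built from the `arm` column would group the patients, in the same way the methods were grouped in the walkthrough.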

