A nice-looking R Markdown template


The template comes from the paper

Removing unwanted variation from large-scale RNA sequencing data with PRPS

The paper provides a large amount of data and code.

The code is in the GitHub repository RMolania/TCGA_PanCancer_UnwantedVariation.

The template requires the rmdformats R package.
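As a minimal sketch of how the package is used (not taken from the paper's repository): the snippet below writes a tiny demo document with a `rmdformats::readthedown` header to a temporary file and renders it only if `rmarkdown`, `rmdformats` and pandoc are available. The file name `demo_report.Rmd` and the document text are assumptions made for illustration.

```r
# A minimal R Markdown document that uses the readthedown format.
# The title and body text are placeholders, not part of the original template.
demo_lines <- c(
  "---",
  'title: "Demo report"',
  "output:",
  "  rmdformats::readthedown:",
  "    code_folding: hide",
  "    number_sections: yes",
  "    toc_depth: 3",
  "---",
  "",
  "# Introduction",
  "",
  "Some text."
)

# Hypothetical file name, written to the session's temporary directory.
rmd_file <- file.path(tempdir(), "demo_report.Rmd")
writeLines(demo_lines, rmd_file)

# Render only when the required packages and pandoc are present.
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    requireNamespace("rmdformats", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  out <- rmarkdown::render(rmd_file, quiet = TRUE)  # path of the HTML output
} else {
  out <- NULL
  message('Install the packages first: install.packages(c("rmarkdown", "rmdformats"))')
}
```

In RStudio, pressing Knit on a document with the same header should have the same effect as the `rmarkdown::render()` call.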


Contents of the R Markdown header

---
title: "Removing tumour purity, library size and batch effects from the TCGA breast cancer RNA-seq data using RUV-III-PRPS"
author:
- name: Ramyar Molania
  affiliation: Papenfuss Lab, Bioinformatics, WEHI.
  url: https://www.wehi.edu.au/people/tony-papenfuss
date: "15-02-2020"
output:
  rmdformats::readthedown:
    code_folding: hide
    gallery: yes
    highlight: tango
    lightbox: yes
    self_contained: yes
    thumbnails: no
    number_sections: yes
    toc_depth: 3
    use_bookdown: yes
  bookdown::html_document2:
    df_print: paged
  html_document:
    toc_depth: '3'
    df_print: paged
params:
  update_date: !r paste("Last updated on:", Sys.Date())
editor_options:
  chunk_output_type: console
---
`r params$update_date`

<style type="text/css">
h1.title {
  font-size: 28px;
  color: DarkRed;
}
h1 { /* Header 1 */
  font-size: 24px;
  color: DarkBlue;
}
h2 { /* Header 2 */
  font-size: 20px;
  color: DarkBlue;
}
h3 { /* Header 3 */
  font-size: 18px;
  color: DarkBlue;
}
h4 { /* Header 4 */
  font-size: 16px;
  color: DarkBlue;
}
</style>

<style>
p.caption {
  font-size: 46em;
  font-style: italic;
  color: black;
}
</style>



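```{r setup, include=FALSE}
# Global chunk options: keep code untidied, widen figures,
# and hide messages and warnings in the rendered report.
knitr::opts_chunk$set(
  tidy = FALSE,
  fig.width = 10,
  message = FALSE,
  warning = FALSE
)
```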

# Introduction

Effective removal of unwanted variation is essential to derive meaningful biological results from RNA-seq data, particularly when the data come from large and complex studies. We have previously proposed a new method, removing unwanted variation III (RUV-III), to normalize gene expression data [(R.Molania, NAR, 2019)](https://academic.oup.com/nar/article/47/12/6073/5494770?login=true). The RUV-III method requires well-designed technical replicates (well-distributed across sources of unwanted variation) and negative control genes to estimate known and unknown sources of unwanted variation and remove them from the data.\
We propose a novel strategy, pseudo-replicates of pseudo-samples (PRPS) [R.Molania, bioRxiv, 2021](https://www.biorxiv.org/content/10.1101/2021.11.01.466731v1), for deploying RUV-III to normalize RNA-seq data in situations when technical replicates are not available or are not well-designed. Our approach requires at least one **roughly** known biologically homogeneous subclass of samples that is present across sources of unwanted variation. For example, in a cancer RNA-seq study where normal tissues are present across all sources of unwanted variation, we can use these samples to create PRPS.\
To create PRPS, we first need to identify the sources of unwanted variation, which we call batches in the data. Then the gene expression measurements of suitable biologically homogeneous sets of samples are averaged within batches, and the results are called pseudo-samples. Since the variation between pseudo-samples in different batches is mainly unwanted variation, by defining them as pseudo-replicates and using them in RUV-III as replicates, we can easily and effectively remove the unwanted variation. We refer readers to our paper for more technical details [R.Molania, bioRxiv, 2021](https://www.biorxiv.org/content/10.1101/2021.11.01.466731v1).\

Here, we use the TCGA invasive breast cancer (BRCA) RNA-seq data as an example to show how to remove tumour purity, flow cell chemistry, library size and batch effects (plate effects) from the data. We illustrate the value of our approach by comparing it to the standard TCGA normalizations on the TCGA BRCA RNA-seq data. Further, we demonstrate how unwanted variation can compromise several downstream analyses and can lead to wrong biological conclusions. We will also assess the performance of RUV-III with poorly chosen PRPS and in situations where biological labels are only partially known.\
Note that RUV-III with PRPS is not limited to TCGA data: it can be used for any large genomics project involving multiple labs, technicians, platforms, ...\

## Data preparation

The TCGA consortium aligned RNA sequencing reads to the hg38 reference genome using the STAR aligner and quantified the results at gene level using HTSeq and the GENCODE v22 gene annotation [Ref](https://docs.gdc.cancer.gov/Data/Bioinformatics_Pipelines/Expression_mRNA_Pipeline/). The TCGA RNA-seq data are publicly available in three formats: raw counts, FPKM and FPKM with upper-quartile normalization (FPKM.UQ). All these formats for individual cancer types (33 cancer types, ~11,000 samples) were downloaded using the R/Bioconductor package (version 2.16.1). The TCGA normalized microarray gene expression data were downloaded from the Broad GDAC [Firehose](https://gdac.broadinstitute.org) repository, data version 2016/01/28. Tissue source sites (TSS) and batches of sequencing plates were extracted from individual TCGA [patient barcodes](https://docs.gdc.cancer.gov/Encyclopedia/pages/TCGA_Barcode/), and sample processing times were downloaded from the [MD Anderson Cancer Centre TCGA Batch Effects website](https://bioinformatics.mdanderson.org/public-software/tcga-batch-effects). Pathological features of cancer patients were downloaded from the Broad GDAC Firehose repository (https://gdac.broadinstitute.org). The details of processing the TCGA BRCA RNA-seq samples using two flow cell chemistries were received by personal communication from Dr. K Hoadley. The TCGA survival data reported by [Liu et al.](https://www.cell.com/cell/fulltext/S0092-8674(18)30229-0?_returnURL=https%3A%2F%2Flinkinghub.elsevier.com%2Fretrieve%2Fpii%2FS0092867418302290%3Fshowall%3Dtrue) were used in this paper. The consensus measurement of purity estimation (CPE) values were downloaded from the [Aran et al.](https://www.nature.com/articles/ncomms9971) study.\
We have generated SummarizedExperiment objects for all the TCGA RNA-seq datasets. These datasets can be found at [TCGA_PanCancerRNAseq](https://zenodo.org/record/6326542#.YimR0C8Rquo). Unwanted variation in all the datasets can be explored using an R Shiny application published in [(R.Molania, bioRxiv, 2021)](https://www.biorxiv.org/content/10.1101/2021.11.01.466731v1.article-metrics).\
All datasets that are required for this vignette can be found at this [link](https://doi.org/10.5281/zenodo.6392171).

# TCGA BRCA gene expression data

## RNA-seq data

We load the TCGA_SummarizedExperiment_HTseq_BRCA.rds file. This is a SummarizedExperiment object that contains:\
**assays:**\
-Raw counts\
-FPKM\
-FPKM.UQ\
**colData:**\
-Batch information\
-Clinical information (collected from different resources)\
**rowData:**\
-Genes' details (GC, chromosome, ...)\
-Several lists of housekeeping genes\
  
The lists of housekeeping genes might be suitable to use as negative control genes (NCG) for the RUV-III normalization.
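
The excerpt stops before any code that reads the object. As a hedged sketch only, a file like this could be inspected as follows. It assumes the Bioconductor package `SummarizedExperiment` is installed and that the `.rds` file sits in the working directory. The helper name `inspect_se` is hypothetical and is not part of the original vignette.

```r
# Hypothetical helper: list the assays and annotation columns stored in a
# SummarizedExperiment object that was saved as an .rds file.
inspect_se <- function(path = "TCGA_SummarizedExperiment_HTseq_BRCA.rds") {
  if (!file.exists(path) ||
      !requireNamespace("SummarizedExperiment", quietly = TRUE)) {
    message("File or SummarizedExperiment package not available; nothing loaded.")
    return(invisible(NULL))
  }
  se <- readRDS(path)
  list(
    assays   = SummarizedExperiment::assayNames(se),      # names of the stored assays
    col_data = names(SummarizedExperiment::colData(se)),  # sample-level columns
    row_data = names(SummarizedExperiment::rowData(se)),  # gene-level columns
    dims     = dim(se)                                    # genes x samples
  )
}

info <- inspect_se()
```

The exact assay and column names depend on how the object was built, so the sketch lists them and does not hard-code any.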

Result

[Image: the rendered report]
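
The Introduction quoted in the template describes pseudo-samples as averages of biologically homogeneous samples within each batch. The base R sketch below illustrates only that averaging step on simulated data. The matrix, the group labels and all variable names are assumptions made for the example; this is not the authors' implementation, and it does not run RUV-III itself.

```r
set.seed(1)

# Simulated log-expression matrix: 5 genes x 12 samples (toy values only).
expr <- matrix(rnorm(5 * 12), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("s", 1:12)))

# Assumed annotations: a roughly known biological group and a batch per sample.
biology <- rep(c("normal", "tumour"), each = 6)
batch   <- rep(c("plate1", "plate2", "plate3"), times = 4)

# One pseudo-sample per biology-by-batch group:
# the gene-wise mean over the samples in that group.
group <- paste(biology, batch, sep = "_")
pseudo_samples <- sapply(
  split(seq_len(ncol(expr)), group),
  function(idx) rowMeans(expr[, idx, drop = FALSE])
)

# Pseudo-samples that share the same biology but come from different batches
# form one pseudo-replicate set.
replicate_set <- sub("_.*$", "", colnames(pseudo_samples))
```

Here `pseudo_samples` has one column per biology-by-batch group, and `replicate_set` marks which columns would be treated as replicates of each other.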
