Accelerating the Somatic Mutation Detection Pipeline, Part 2 (ctDNA and Other High-Depth Samples)

Published 2023-07-27, Guangdong



Sentieon: Somatic Variant Detection, Part 2

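Sentieon focuses on removing the speed and accuracy bottlenecks in bioinformatics data analysis. Through deep algorithmic optimization and enterprise-grade software engineering, it substantially improves the efficiency, accuracy, and reliability of NGS data processing.

For somatic variant detection, Sentieon provides two modules: TNscope and TNhaplotyper2.

TNscope: uses Sentieon's proprietary algorithm, offering faster computation (a speedup of 10x or more) and higher accuracy; it is especially suited to clinical genetic diagnostic samples.

TNhaplotyper2: matches Mutect2 results (currently matched to version 4.1.9) while running more than 10 times faster.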

ctDNA Variant Detection Analysis

The step-by-step scripts below are intended mainly for ctDNA and other high-depth sequencing samples (2000-5000x depth, AF > 0.3%).
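The scripts in the following steps rely on shell variables that the article never defines. A minimal sketch of a setup preamble might look like this; only the variable names come from the scripts, while every path and sample name is a hypothetical placeholder to be replaced with real values.

```shell
# Hypothetical settings: all paths and sample names below are placeholders.
tumor="tumor_sample"                  # used for the read group ID and SM in step 1
normal="normal_sample"
platform="ILLUMINA"
nt=$(nproc 2>/dev/null || echo 4)     # thread count; falls back to 4 if nproc is unavailable

fasta="/path/to/reference.fasta"
tumor_fastq_1="/path/to/tumor_R1.fastq.gz"
tumor_fastq_2="/path/to/tumor_R2.fastq.gz"
normal_fastq_1="/path/to/normal_R1.fastq.gz"
normal_fastq_2="/path/to/normal_R2.fastq.gz"

BED="/path/to/target_regions.bed"
dbsnp="/path/to/dbsnp.vcf.gz"
known_Mills_indels="/path/to/Mills_indels.vcf.gz"
known_1000G_indels="/path/to/1000G_indels.vcf.gz"

# Step 4 refers to samples by these names; they are set equal to the SM
# values written into the read groups in step 1.
TUMOR_SM="$tumor"
NORMAL_SM="$normal"
```

Setting `TUMOR_SM` and `NORMAL_SM` from `tumor` and `normal` keeps the sample names passed to the variant caller consistent with the read groups created during alignment.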

Step 1: Alignment

# ******************************************
# 1a. Mapping reads with BWA-MEM, sorting for tumor sample
# ******************************************
( sentieon bwa mem -M -R "@RG\tID:$tumor\tSM:$tumor\tPL:$platform" \
    -t $nt -K 10000000 $fasta $tumor_fastq_1 $tumor_fastq_2 || \
    echo -n 'error' ) | \
    sentieon util sort -o tumor_sorted.bam -t $nt --sam2bam -i -

# ******************************************
# 1b. Mapping reads with BWA-MEM, sorting for normal sample
# ******************************************
( sentieon bwa mem -M -R "@RG\tID:$normal\tSM:$normal\tPL:$platform" \
    -t $nt -K 10000000 $fasta $normal_fastq_1 $normal_fastq_2 || \
    echo -n 'error' ) | \
    sentieon util sort -o normal_sorted.bam -t $nt --sam2bam -i -
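The `|| echo -n 'error'` idiom in both alignment commands appears to guard against a silent aligner failure: a pipeline's exit status is normally that of its last command, so a crash on the left side of the pipe could otherwise go unnoticed. The sketch below illustrates the idea with stand-in functions; `fake_aligner` and `fake_sorter` are hypothetical and only mimic a failing producer and a consumer that rejects malformed input.

```shell
# Stand-in for an aligner that emits partial output and then fails.
fake_aligner() { printf 'read1\nread2\n'; return 1; }

# Stand-in for a sorter that fails when it meets a line it cannot parse.
fake_sorter() {
    local line
    while IFS= read -r line || [ -n "$line" ]; do
        [ "$line" = "error" ] && return 1
    done
    return 0
}

# Without the sentinel, the pipeline reports the sorter's success.
plain_status=0
fake_aligner | fake_sorter || plain_status=$?

# With the sentinel, the bogus token reaches the sorter and makes it fail.
guarded_status=0
( fake_aligner || echo -n 'error' ) | fake_sorter || guarded_status=$?

echo "without sentinel: exit $plain_status"
echo "with sentinel: exit $guarded_status"
```

In bash, `set -o pipefail` is another way to surface a failure from the left side of a pipe; the scripts in this article rely on the sentinel instead.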

Step 2: PCR Duplicate Removal (Skip for Amplicon)

# ******************************************
# 2a. Remove duplicate reads for tumor sample.
# ******************************************
# Collect duplicate scoring information
sentieon driver -t $nt -i tumor_sorted.bam \
    --algo LocusCollector \
    --fun score_info \
    tumor_score.txt
# Remove duplicates using the collected scores
sentieon driver -t $nt -i tumor_sorted.bam \
    --algo Dedup \
    --score_info tumor_score.txt \
    --metrics tumor_dedup_metrics.txt \
    tumor_deduped.bam

# ******************************************
# 2b. Remove duplicate reads for normal sample.
# ******************************************
sentieon driver -t $nt -i normal_sorted.bam \
    --algo LocusCollector \
    --fun score_info \
    normal_score.txt
sentieon driver -t $nt -i normal_sorted.bam \
    --algo Dedup \
    --score_info normal_score.txt \
    --metrics normal_dedup_metrics.txt \
    normal_deduped.bam
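Steps 2a and 2b differ only in the file name prefix, so the two commands can be wrapped in one function. This is a sketch under that assumption; the function name `dedup_sample` is hypothetical, and the `sentieon` invocations are the same ones used in this step.

```shell
# Run LocusCollector and then Dedup for one sample.
# $1 is the file prefix ("tumor" or "normal"); $nt must already be set.
dedup_sample() {
    local prefix="$1"
    sentieon driver -t "$nt" -i "${prefix}_sorted.bam" \
        --algo LocusCollector \
        --fun score_info \
        "${prefix}_score.txt" || return 1
    sentieon driver -t "$nt" -i "${prefix}_sorted.bam" \
        --algo Dedup \
        --score_info "${prefix}_score.txt" \
        --metrics "${prefix}_dedup_metrics.txt" \
        "${prefix}_deduped.bam"
}

# Usage: dedup_sample tumor && dedup_sample normal
```

Returning early when LocusCollector fails avoids running Dedup against a missing or incomplete score file.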

Step 3: Base Quality Score Recalibration (Skip for Small Panel)

# ******************************************
# 3a. Base recalibration for tumor sample
# ******************************************
sentieon driver -r $fasta -t $nt -i tumor_deduped.bam --interval $BED \
    --algo QualCal \
    -k $dbsnp \
    -k $known_Mills_indels \
    -k $known_1000G_indels \
    tumor_recal_data.table

# ******************************************
# 3b. Base recalibration for normal sample
# ******************************************
sentieon driver -r $fasta -t $nt -i normal_deduped.bam --interval $BED \
    --algo QualCal \
    -k $dbsnp \
    -k $known_Mills_indels \
    -k $known_1000G_indels \
    normal_recal_data.table
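The recalibration commands read the reference, the BED file, and three known-sites files, and a missing input typically only shows up once the job has started. A small pre-flight check could catch that earlier; the helper name `require_files` is hypothetical and not part of Sentieon.

```shell
# Return non-zero if any listed file is missing or empty,
# reporting each problem on stderr.
require_files() {
    local missing=0 f
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty: $f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage before step 3:
# require_files "$fasta" "$BED" "$dbsnp" "$known_Mills_indels" \
#     "$known_1000G_indels" tumor_deduped.bam normal_deduped.bam || exit 1
```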

Step 4: Variant Calling

# $TUMOR_SM and $NORMAL_SM must match the SM values of the read groups set in step 1.
# Optional: to use a panel of normals, add `--pon panel_of_normal.vcf \`
# on its own line before the output file.
sentieon driver -r $fasta -t $nt -i tumor_deduped.bam -i normal_deduped.bam \
    --interval $BED --interval_padding 10 \
    --algo TNscope \
    --tumor_sample $TUMOR_SM \
    --normal_sample $NORMAL_SM \
    --dbsnp $dbsnp \
    --sv_mask_ext 10 \
    --max_fisher_pv_active 0.05 \
    --min_tumor_allele_frac 0.01 \
    --filter_t_alt_frac 0.01 \
    --max_normal_alt_frac 0.005 \
    --max_normal_alt_qsum 200 \
    --max_normal_alt_cnt 5 \
    --assemble_mode 4 \
    output_tnscope.pre_filter.vcf.gz
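To relate the depth and allele-fraction figures in this workflow to read counts, the expected number of reads supporting a variant is roughly depth multiplied by allele fraction. The sketch below does that arithmetic; the helper name `expected_alt_reads` is hypothetical, and the estimate ignores sampling noise and sequencing error.

```shell
# Expected number of alt-supporting reads at a given depth and allele fraction.
# $1 = depth, $2 = allele fraction (for example 0.003 for 0.3%).
expected_alt_reads() {
    awk -v depth="$1" -v af="$2" 'BEGIN { printf "%.0f\n", depth * af }'
}

expected_alt_reads 2000 0.003   # prints 6
expected_alt_reads 5000 0.003   # prints 15
expected_alt_reads 2000 0.01    # prints 20
expected_alt_reads 5000 0.01    # prints 50
```

At the low end of the stated range (2000x, AF 0.3%), a variant is supported by only about six reads, which is why the duplicate handling and filtering steps matter for this kind of sample.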

Step 5: Variant Filtration

bcftools annotate -x "FILTER/triallelic_site" output_tnscope.pre_filter.vcf.gz | \
    bcftools filter -m + -s "low_qual" -e "QUAL < 10" | \
    bcftools filter -m + -s "short_tandem_repeat" -e "RPA[0]>=10" | \
    bcftools filter -m + -s "read_pos_bias" -e "FMT/ReadPosRankSumPS[0] < -5" | \
    bcftools norm -f $fasta -m +any | \
    sentieon util vcfconvert - output_tnscope.filtered.vcf.gz
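Because each `bcftools filter` call uses `-m +`, filter names are appended rather than replaced, so a single record can carry several names joined by semicolons in the FILTER column. A quick tally of that column gives a rough view of what the filtration did; the function name `filter_tally` is hypothetical.

```shell
# Count records per FILTER value (column 7) in a VCF, most frequent first.
# $1 = path to the filtered VCF, for example output_tnscope.filtered.vcf.gz
filter_tally() {
    bcftools view -H "$1" | cut -f7 | sort | uniq -c | sort -rn
}

# Usage: filter_tally output_tnscope.filtered.vcf.gz
```

Records that passed every filter appear under `PASS`; combined values such as `low_qual;read_pos_bias` are counted as their own category.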

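About Sentieon

Sentieon is a complete, software-only secondary analysis solution for genomic variant detection. Its pipelines stay faithful to the mathematical models of gold-standard tools such as BWA, GATK, MuTect2, STAR, Minimap2, Fgbio, and Picard. While matching the results of the open-source pipelines, it greatly improves analysis efficiency and detection accuracy for WGS, WES, panel, UMI, ctDNA, RNA, and other sequencing data, and it supports all current second- and third-generation sequencing platforms.

The Sentieon team has extensive experience in software development and algorithm optimization engineering. It is dedicated to removing speed and accuracy bottlenecks in biological data analysis and provides efficient, accurate software solutions to partners in molecular diagnostics, drug development, clinical medicine, population cohorts, plant and animal research, and other fields, jointly advancing genomic technology.

As of March 2023, Sentieon serves more than 1,300 users worldwide and has been widely cited in top-tier journals such as NEJM, Cell, and Nature, with more than 700 citing publications. In addition, Sentieon has for several consecutive years taken top honors in authoritative evaluations including Precision FDA and Dream Challenges, earning broad recognition in the field.

Software trial: https://www.insvast.com/sentieon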
