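When sequencing volume is limited, a minimum depth of 4X is recommended; when more sequencing is available, 10X or higher is recommended to ensure quality. The maximum depth should not exceed 1000X.
For example: 1) this study in pigs reports that the SNP error rate rises sharply when depth falls below 4X: https://link.springer.com/article/10.1186/s12859-019-3164-z
2) SNP sites with extremely high depth may lie in repetitive regions of the genome and are best removed: https://www.nature.com/articles/nbt.2053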
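
The depth bounds above can be applied with a small text filter. The following is only a minimal sketch, not a GATK feature: `filter_by_depth` is a hypothetical helper name, and it assumes the VCF carries a `DP` key in the INFO column. INFO/DP is usually the total depth across all samples, so for a multi-sample VCF the 4 and 1000 thresholds would need to be scaled or replaced by per-sample checks.

```shell
# Hypothetical helper: keep VCF records whose INFO/DP lies within [min, max].
# Usage: filter_by_depth MIN MAX < input.vcf > output.vcf
filter_by_depth() {
  awk -F'\t' -v min="$1" -v max="$2" '
    /^#/ { print; next }                 # pass header lines through unchanged
    {
      dp = ""
      n = split($8, info, ";")           # column 8 is INFO: key=value pairs joined by ";"
      for (i = 1; i <= n; i++) {
        if (info[i] ~ /^DP=/) { dp = substr(info[i], 4) }
      }
      # Assumption of this sketch: records without a DP annotation are dropped.
      if (dp != "" && dp + 0 >= min && dp + 0 <= max) print
    }
  '
}
```

For example, `zcat snps.vcf.gz | filter_by_depth 4 1000 > snps_depth.vcf` would keep only records with a total depth between 4 and 1000.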

GATK filtering command lines:
SNP
gatk VariantFiltration \
-V snps.vcf.gz \
-filter "QD < 2.0" --filter-name "QD2" \
-filter "QUAL < 30.0" --filter-name "QUAL30" \
-filter "SOR > 3.0" --filter-name "SOR3" \
-filter "FS > 60.0" --filter-name "FS60" \
-filter "MQ < 40.0" --filter-name "MQ40" \
-filter "MQRankSum < -12.5" --filter-name "MQRankSum-12.5" \
-filter "ReadPosRankSum < -8.0" --filter-name "ReadPosRankSum-8" \
-O snps_filtered.vcf.gz
INDEL
gatk VariantFiltration \
-V indels.vcf.gz \
-filter "QD < 2.0" --filter-name "QD2" \
-filter "QUAL < 30.0" --filter-name "QUAL30" \
-filter "FS > 200.0" --filter-name "FS200" \
-filter "ReadPosRankSum < -20.0" --filter-name "ReadPosRankSum-20" \
-O indels_filtered.vcf.gz
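
VariantFiltration does not delete records: it writes the name of each failed filter into the FILTER column and marks passing records as PASS. To see how many records each filter caught, a tally of that column can help. This is a minimal sketch; `count_filters` is a hypothetical helper name, not part of GATK.

```shell
# Hypothetical helper: count VCF records by the value of the FILTER column (column 7).
# Usage: count_filters < filtered.vcf
count_filters() {
  awk -F'\t' '
    !/^#/ { n[$7]++ }                            # skip header lines, tally FILTER values
    END   { for (f in n) print n[f] "\t" f }     # print: count <TAB> filter name
  ' | sort -k2,2                                 # order the rows by filter name
}
```

For example, `zcat snps_filtered.vcf.gz | count_filters` prints one row per FILTER value; a record that fails several filters appears under the joined names, such as `QD2;FS60`. To write out only the passing records, GATK's SelectVariants with `--exclude-filtered` is one option.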
If you want to run the filtering yourself, this video course walks through the procedure: https://bdtcd.xetslk.com/s/1VQOjQ
