Official reference documentation:
https://mgi-tech-bioinformatics.github.io/DNBelab_C_Series_HT_scRNA-analysis-software/Document/README.html
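1. Reference genome preparation:
Using Arabidopsis thaliana as the example, download the genome and annotation, then build the index: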
# Download reference files
#wget https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-62/fasta/arabidopsis_thaliana/dna/Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz
#wget https://ftp.ensemblgenomes.ebi.ac.uk/pub/plants/release-62/gtf/arabidopsis_thaliana/Arabidopsis_thaliana.TAIR10.62.gtf.gz
# Extract files
#gzip -d Arabidopsis_thaliana.TAIR10.dna.toplevel.fa.gz
#gzip -d Arabidopsis_thaliana.TAIR10.62.gtf.gz
#python /home/us001/single-cell/scripts/gtf_add_PT_MT_prefix.py --chloroplast-ids Pt --mitochondria-ids Mt -i Arabidopsis_thaliana.TAIR10.62.gtf -o Arabidopsis_thaliana.TAIR10.62.MtPt.gtf --regex-field gene_biotype gene_name gene_id transcript_id --regex-pattern "ribosomal|rRNA|rps|rpl" --regex-prefix RP-
# Build reference
dnbc4tools tools mkgtf --ingtf Arabidopsis_thaliana.TAIR10.62.gtf --output genes.filter.gtf --type gene_biotype \
--include protein_coding,lncRNA,lincRNA,antisense,IG_V_gene,IG_LV_gene,IG_J_gene,IG_C_gene,IG_V_pseudogene,IG_J_pseudogene,IG_C_pseudogene,TR_V_gene,TR_D_gene,TR_J_gene,TR_C_gene,rRNA
dnbc4tools rna mkref --ingtf genes.filter.gtf --fasta Arabidopsis_thaliana.TAIR10.dna.toplevel.fa --threads 10 --species Arabidopsis_thaliana
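Before settling on the --include list, it can help to see which gene_biotype values the GTF actually contains. The IG_* and TR_* entries above are vertebrate immune-gene categories and are presumably absent from a plant annotation, so the effective filter is likely much shorter than it looks. The following is a minimal sketch, not part of dnbc4tools: list_biotypes is a hypothetical helper name, and it assumes the Ensembl-style attribute format gene_biotype "value" in column 9.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a dnbc4tools command): tally gene_biotype values in a GTF.
# Assumes Ensembl-style attributes in column 9, e.g.  gene_biotype "protein_coding";
list_biotypes() {
    awk -F'\t' '
        # Only count "gene" rows so each gene is tallied once.
        $3 == "gene" && match($9, /gene_biotype "[^"]*"/) {
            # Skip the 14 characters of `gene_biotype "` and drop the closing quote.
            biotype = substr($9, RSTART + 14, RLENGTH - 15)
            count[biotype]++
        }
        END { for (b in count) print b "\t" count[b] }
    ' "$1" | sort
}

# Usage (file name taken from the steps above):
#   list_biotypes Arabidopsis_thaliana.TAIR10.62.gtf
```

Any biotype printed here but missing from --include will be dropped by mkgtf, so comparing the two lists is a quick sanity check before running mkref.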
2. Example data download:
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809366/CNX0734939/CNR0835613/FP200003884_L01_31_2.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809365/CNX0734938/CNR0835612/FP200003884_L01_27_2.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809368/CNX0734941/CNR0835615/DP8400024433BL_L01_80_2.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809367/CNX0734940/CNR0835614/DP8400024433BL_L01_78_2.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809361/CNX0734936/CNR0835610/DP8400025960BR_L01_97_1.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809366/CNX0734939/CNR0835613/FP200003884_L01_31_1.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809363/CNX0734937/CNR0835611/DP8400026128BL_L01_71_1.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809363/CNX0734937/CNR0835611/DP8400026128BL_L01_71_2.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809368/CNX0734941/CNR0835615/DP8400024433BL_L01_80_1.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809365/CNX0734938/CNR0835612/FP200003884_L01_27_1.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809367/CNX0734940/CNR0835614/DP8400024433BL_L01_78_1.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data5/CNP0004389/CNS0809361/CNX0734936/CNR0835610/DP8400025960BR_L01_97_2.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data2/CNP0004389/CNS1008488/CNX0953943/CNR1093075/FP200005151_L01_95_1.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data2/CNP0004389/CNS1008488/CNX0953943/CNR1093075/FP200005151_L01_95_2.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data2/CNP0004389/CNS1008489/CNX0953944/CNR1093076/DP8400026127BL_L01_70_1.fq.gz
wget -c ftp://ftp.cngb.org/pub/CNSA/data2/CNP0004389/CNS1008489/CNX0953944/CNR1093076/DP8400026127BL_L01_70_2.fq.gz
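These files are large and wget -c may resume an interrupted transfer, so a quick check after downloading is worthwhile. The sketch below is only an illustration: find_unpaired and find_corrupt are hypothetical helper names, and they assume all FASTQ files sit in one directory and follow the _1.fq.gz / _2.fq.gz naming used above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the downloaded FASTQ files.

# Print read-1 files that have no matching read-2 file in the same directory.
find_unpaired() {
    local dir="$1" r1 r2
    for r1 in "$dir"/*_1.fq.gz; do
        [ -e "$r1" ] || continue          # glob matched nothing
        r2="${r1%_1.fq.gz}_2.fq.gz"
        [ -e "$r2" ] || basename "$r1"
    done
}

# Print files that fail the gzip integrity test (truncated or damaged downloads).
find_corrupt() {
    local dir="$1" f
    for f in "$dir"/*.fq.gz; do
        [ -e "$f" ] || continue           # glob matched nothing
        gzip -t "$f" 2>/dev/null || basename "$f"
    done
}

# Usage (directory taken from the run commands in step 3):
#   find_unpaired ~/BGI-sc/data/ara
#   find_corrupt  ~/BGI-sc/data/ara
```

Both functions print nothing when everything is fine. Any file name they do print is a candidate for re-downloading.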
3. Expression quantification with dnbc4tools:
dnbc4tools rna run \
--name 1D1 \
--cDNAfastq1 ~/BGI-sc/data/ara/DP8400025960BR_L01_97_1.fq.gz \
--cDNAfastq2 ~/BGI-sc/data/ara/DP8400025960BR_L01_97_2.fq.gz \
--oligofastq1 ~/BGI-sc/data/ara/DP8400026128BL_L01_71_1.fq.gz \
--oligofastq2 ~/BGI-sc/data/ara/DP8400026128BL_L01_71_2.fq.gz \
--genomeDir ~/BGI-sc/ref/arabidopsis_thaliana/Arabidopsis_thaliana \
--threads 20
dnbc4tools rna run \
--name 1D2 \
--cDNAfastq1 ~/BGI-sc/data/ara/FP200005151_L01_95_1.fq.gz \
--cDNAfastq2 ~/BGI-sc/data/ara/FP200005151_L01_95_2.fq.gz \
--oligofastq1 ~/BGI-sc/data/ara/DP8400026127BL_L01_70_1.fq.gz \
--oligofastq2 ~/BGI-sc/data/ara/DP8400026127BL_L01_70_2.fq.gz \
--genomeDir ~/BGI-sc/ref/arabidopsis_thaliana/Arabidopsis_thaliana \
--threads 20
dnbc4tools rna run \
--name 4D1 \
--cDNAfastq1 ~/BGI-sc/data/ara/FP200003884_L01_27_1.fq.gz \
--cDNAfastq2 ~/BGI-sc/data/ara/FP200003884_L01_27_2.fq.gz \
--oligofastq1 ~/BGI-sc/data/ara/DP8400024433BL_L01_78_1.fq.gz \
--oligofastq2 ~/BGI-sc/data/ara/DP8400024433BL_L01_78_2.fq.gz \
--genomeDir ~/BGI-sc/ref/arabidopsis_thaliana/Arabidopsis_thaliana \
--threads 20
dnbc4tools rna run \
--name 4D2 \
--cDNAfastq1 ~/BGI-sc/data/ara/FP200003884_L01_31_1.fq.gz \
--cDNAfastq2 ~/BGI-sc/data/ara/FP200003884_L01_31_2.fq.gz \
--oligofastq1 ~/BGI-sc/data/ara/DP8400024433BL_L01_80_1.fq.gz \
--oligofastq2 ~/BGI-sc/data/ara/DP8400024433BL_L01_80_2.fq.gz \
--genomeDir ~/BGI-sc/ref/arabidopsis_thaliana/Arabidopsis_thaliana \
--threads 20
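The four commands above differ only in the sample name and the two FASTQ prefixes, so they can also be generated from a small sample sheet. This is a sketch under assumptions: build_run_cmd, samples and print_all_cmds are hypothetical names, the DATA_DIR and GENOME_DIR defaults simply mirror the paths used above, and only flags already shown in this post are emitted.

```shell
#!/usr/bin/env bash
# Assumed locations, mirroring the commands above; override via environment if needed.
DATA_DIR="${DATA_DIR:-$HOME/BGI-sc/data/ara}"
GENOME_DIR="${GENOME_DIR:-$HOME/BGI-sc/ref/arabidopsis_thaliana/Arabidopsis_thaliana}"

# Print (do not run) the dnbc4tools command for one sample.
# Arguments: sample name, cDNA FASTQ prefix, oligo FASTQ prefix.
build_run_cmd() {
    local name="$1" cdna="$2" oligo="$3"
    printf 'dnbc4tools rna run --name %s --cDNAfastq1 %s --cDNAfastq2 %s --oligofastq1 %s --oligofastq2 %s --genomeDir %s --threads 20\n' \
        "$name" \
        "$DATA_DIR/${cdna}_1.fq.gz" "$DATA_DIR/${cdna}_2.fq.gz" \
        "$DATA_DIR/${oligo}_1.fq.gz" "$DATA_DIR/${oligo}_2.fq.gz" \
        "$GENOME_DIR"
}

# Sample sheet: name, cDNA prefix, oligo prefix (pairings copied from the commands above).
samples() {
cat <<'EOF'
1D1 DP8400025960BR_L01_97 DP8400026128BL_L01_71
1D2 FP200005151_L01_95 DP8400026127BL_L01_70
4D1 FP200003884_L01_27 DP8400024433BL_L01_78
4D2 FP200003884_L01_31 DP8400024433BL_L01_80
EOF
}

# Print one command per sample for review.
print_all_cmds() {
    local name cdna oligo
    samples | while read -r name cdna oligo; do
        build_run_cmd "$name" "$cdna" "$oligo"
    done
}

# Usage:
#   print_all_cmds          # dry run: inspect the commands
#   print_all_cmds | bash   # execute them one after another
```

Printing first and piping to bash afterwards keeps the run easy to audit. Note that this simple approach assumes the paths contain no spaces.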