
skipped exon (SE),外显子跳跃,指一个或多个外显子连同其两端的内含子一起被剪切,在成熟mRNA中不存在。
alternative 5' splice site (A5SS),5’端可变剪接,它们的3’端剪接位点一致但5’端剪接位点不同,产生不同长度的5’端外显子。
alternative 5' splice site (A3SS),3’端可变剪接,它们的5’端剪接位点一致但3’端剪接位点不同,产生不同长度的3’端外显子。
mutually exclusive exons (MXE),外显子互斥,成熟的mRNA变体中,彼此特有的外显子,这些外显子不能同时出现在同一成熟mRNA中。
retained intron (RI),内含子保留,在一些转录本中内含子不会被剪切掉,保留在成熟的mRNA。
rMATS采用exon inclusion level 来定义样本中可变剪切事件的表达量,以外显子跳跃(Skipped Exon)为例,正常的转录本称之为Exon Inclusion Isofrom, 发生了外显子跳跃的转录本则称之为Exon Skipping Isofrom。

用 I 表示比对到Exon Inclusion Isofrom上的reads,S表示比对到Exon Skipping Isofrom上的reads, 则该外显子跳跃的可变剪切事件比例可以表示为:

可以看到,exon inclusion level实际上是inclusion isofrom所占的比例,计算时,用长度校正了原始的reads数。其他类型的可变剪切事件也可以划分成上述两种isoform, 示意图如下:

conda create -n my_rmats_env conda activate my_rmats_env conda install rmats-turbo rmats-turbo --version
python rmats.py -h
usage: rmats.py [options]
optional arguments:
-h, --help show this help message and exit
--version show program's version number and exit
--gtf GTF An annotation of genes and transcripts in GTF format
--b1 B1 A text file containing a comma separated list of the
BAM files for sample_1. (Only if using BAM)
--b2 B2 A text file containing a comma separated list of the
BAM files for sample_2. (Only if using BAM)
--s1 S1 A text file containing a comma separated list of the
FASTQ files for sample_1. If using paired reads the
format is ":" to separate pairs and "," to separate
replicates. (Only if using fastq)
--s2 S2 A text file containing a comma separated list of the
FASTQ files for sample_2. If using paired reads the
format is ":" to separate pairs and "," to separate
replicates. (Only if using fastq)
--od OD The directory for final output
--tmp TMP The directory for intermediate output such as ".rmats"
files from the prep step
-t {paired,single} Type of read used in the analysis: either "paired" for
paired-end data or "single" for single-end data.
Default: paired
--libType {fr-unstranded,fr-firststrand,fr-secondstrand}
Library type. Use fr-firststrand or fr-secondstrand
for strand-specific data. Default: fr-unstranded
--readLength READLENGTH
The length of each read
--variable-read-length
Allow reads with lengths that differ from --readLength
to be processed. --readLength will still be used to
determine IncFormLen and SkipFormLen
--anchorLength ANCHORLENGTH
The anchor length. Default is 1
--tophatAnchor TOPHATANCHOR
The "anchor length" or "overhang length" used in the
aligner. At least "anchor length" NT must be mapped to
each end of a given junction. The default is 6. (Only
if using fastq)
--bi BINDEX The directory name of the STAR binary indices (name of
the directory that contains the SA file). (Only if
using fastq)
--nthread NTHREAD The number of threads. The optimal number of threads
should be equal to the number of CPU cores. Default: 1
--tstat TSTAT The number of threads for the statistical model.
Default: 1
--cstat CSTAT The cutoff splicing difference. The cutoff used in the
null hypothesis test for differential splicing. The
default is 0.0001 for 0.01% difference. Valid: 0 <=
cutoff < 1. Does not apply to the paired stats model
--task {prep,post,both,inte}
Specify which step(s) of rMATS to run. Default: both.
prep: preprocess BAMs and generate a .rmats file.
post: load .rmats file(s) into memory, detect and
count alternative splicing events, and calculate P
value (if not --statoff). both: prep + post. inte
(integrity): check that the BAM filenames recorded by
the prep task(s) match the BAM filenames for the
current command line
--statoff Skip the statistical analysis
--paired-stats Use the paired stats model
--novelSS Enable detection of novel splice sites (unannotated
splice sites). Default is no detection of novel splice
sites
--mil MIL Minimum Intron Length. Only impacts --novelSS
behavior. Default: 50
--mel MEL Maximum Exon Length. Only impacts --novelSS behavior.
Default: 500
##/path/to/b1.txt /path/to/1_1.bam,/path/to/1_2.bam ##/path/to/b2.txt /path/to/2_1.bam,/path/to/2_2.bam python rmats.py --b1 /path/to/b1.txt --b2 /path/to/b2.txt --gtf Gallus_gallus.GRCg6a.101.gtf --od A_vs_B --tmp A_vs_B/tmp -t paired --variable-read-length --readLength 150 --cstat 0.0001 --libType fr-unstranded --novelSS --nthread 4
如果觉得我的文章对您有用,请随意打赏。你的支持将鼓励我继续创作!
