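Create the conda environment by following the official documentation: https://github.com/aertslab/create_cisTarget_databases
Build the duck-to-human homologous gene set (the duck gene symbols are used to filter the human motif-to-TF table):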
python gff_gene2symbol.py GCF_047663525.1.gff gene2symbol.txt
python filter_tbl.py gene2symbol.txt /share/backup01/database/SCENIC/human/motifs-v10nr_clust-nr.hgnc-m0.001-o0.0.tbl --output motifs-v10nr_clust-nr.duck-m0.001-o0.0.tbl
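filter_tbl.py is not shown in this post. As a rough illustration only, the sketch below keeps the rows of the human motif-to-TF table whose gene_name also appears as a duck gene symbol, treating a shared symbol as a proxy for homology. The two-column layout of gene2symbol.txt (gene ID, then symbol) and the function names load_symbols and filter_tbl are assumptions, not the actual script.

```python
import io


def load_symbols(handle, column=1):
    """Collect gene symbols from a tab-separated file (assumed: gene_id<TAB>symbol)."""
    symbols = set()
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) > column and fields[column]:
            symbols.add(fields[column])
    return symbols


def filter_tbl(tbl_handle, symbols, out_handle):
    """Copy header lines plus every row whose gene_name is in `symbols`.

    Returns the number of data rows kept.
    """
    gene_col = None
    kept = 0
    for line in tbl_handle:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#"):
            # The header line names the columns; locate gene_name by name.
            if "gene_name" in fields:
                gene_col = fields.index("gene_name")
            out_handle.write(line)
            continue
        if gene_col is None:
            raise ValueError("no header line with a gene_name column found")
        if len(fields) > gene_col and fields[gene_col] in symbols:
            out_handle.write(line)
            kept += 1
    return kept


# Tiny in-memory demo with made-up rows.
demo_symbols = load_symbols(io.StringIO("gene-A\tSOX2\ngene-B\tPAX6\n"))
demo_out = io.StringIO()
demo_kept = filter_tbl(
    io.StringIO("#motif_id\tmotif_name\tgene_name\nm1\tx\tSOX2\nm2\ty\tTP53\n"),
    demo_symbols,
    demo_out,
)
print(demo_kept, "row(s) kept")
```

Matching on identical symbols misses homologs that are named differently in the two species, so the filtered table is best read as a conservative subset.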
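# Motif IDs that still have an annotation after filtering.
grep -v '^#' motifs-v10nr_clust-nr.duck-m0.001-o0.0.tbl | cut -f 1 | sort -u > motif_all.txt
# Motif IDs that have a motif file. Strip only the last extension: cut -d '.' -f 1 would truncate any ID that itself contains a dot.
ls ../v10nr_clust_public/singletons | sed 's/\.[^.]*$//' > motif_nr.txt
# Keep IDs present in both lists. -x requires whole-line matches, so an ID cannot match a longer ID that merely contains it.
grep -Fxf motif_all.txt motif_nr.txt > motifs_duck.txt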
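Substring matching can pair a motif ID with a longer ID that merely contains it, so an exact-match intersection is a useful cross-check on motifs_duck.txt. A minimal Python sketch, assuming the motif files are named <motif_id>.cb. The suffix, the function name motif_ids_with_files, and the example IDs are assumptions for illustration.

```python
def motif_ids_with_files(annotated_ids, motif_filenames, suffix=".cb"):
    """Return the sorted annotated motif IDs that have a file named <id><suffix>."""
    available = {
        name[: -len(suffix)] for name in motif_filenames if name.endswith(suffix)
    }
    return sorted(set(annotated_ids) & available)


# Made-up IDs: the second file name contains the first ID as a substring,
# but only an exact match should be reported.
demo_ids = motif_ids_with_files(
    ["example__M0004.1", "example__M0099"],
    ["example__M0004.1.cb", "example__M0004.10.cb", "README.txt"],
)
print(demo_ids)
```

In practice the two arguments would come from the first column of the filtered .tbl file and from os.listdir on the motif directory.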
python3 extract_upstream.py GCF_047663525.1.fna GCF_047663525.1.gff gene_upstream1k.fasta 1000 0 -n Name
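extract_upstream.py is likewise not shown. The sketch below illustrates the coordinate logic such a script needs, assuming the arguments 1000 and 0 mean 1000 bp upstream and 0 bp downstream of the gene start, and that GFF coordinates are 1-based and inclusive. For genes on the minus strand the gene start is the GFF end coordinate, and the sequence is reverse-complemented so it reads 5' to 3' relative to the gene. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA sequence (A, C, G, T, N in either case)."""
    return seq.translate(COMPLEMENT)[::-1]


def upstream_region(chrom_seq, start, end, strand, upstream=1000, downstream=0):
    """Return the promoter window of a gene, oriented 5'->3' relative to the gene.

    start and end are 1-based inclusive GFF coordinates. The window covers
    `upstream` bases before the gene start plus `downstream` bases beginning
    at the gene start, clipped to the chromosome ends.
    """
    if strand == "+":
        tss = start
        lo = max(1, tss - upstream)
        hi = min(len(chrom_seq), tss + downstream - 1)
        return chrom_seq[lo - 1:hi]
    # Minus strand: the gene starts at `end`, and upstream lies to the right.
    tss = end
    lo = max(1, tss - downstream + 1)
    hi = min(len(chrom_seq), tss + upstream)
    return reverse_complement(chrom_seq[lo - 1:hi])


# Toy chromosome with 3 bp windows instead of 1000 bp.
chrom = "ACGTTGCAAC"
plus_window = upstream_region(chrom, 5, 8, "+", upstream=3)
minus_window = upstream_region(chrom, 2, 6, "-", upstream=3)
print(plus_window, minus_window)
```

Genes close to a contig end yield windows shorter than 1000 bp, and a gene that starts at the first base of a plus-strand contig yields an empty sequence, so it may be worth dropping empty records before scoring.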
# FASTA file with sequences per region IDs / gene IDs.
fasta_filename=gene_upstream1k.fasta
# Directory with motifs in Cluster-Buster format.
motifs_dir=../v10nr_clust_public/singletons
# File with motif IDs (base name of motif file in ${motifs_dir}).
motifs_list_filename=motifs_duck.txt
# cisTarget motif database output prefix.
db_prefix=duck_up1kb_down0kb
nbr_threads=24
conda activate create_cistarget_databases
create_cistarget_databases_dir=/share/work/biosoft/create_cisTarget_databases/create_cisTarget_databases/
"${create_cistarget_databases_dir}/create_cistarget_motif_databases.py" \
-f "${fasta_filename}" \
-M "${motifs_dir}" \
-m "${motifs_list_filename}" \
-o "${db_prefix}" \
-t "${nbr_threads}"
