univariate_cox_batch.r: batch univariate Cox regression analysis of gene expression


Usage:

Given survival data and a gene expression matrix, the script fits a univariate Cox regression for each gene in batch.

$ Rscript $scriptdir/univariate_cox_batch.r --h
usage:  univariate_cox_batch.r [-h] -m metadata -g
                                                         expset [-t time]
                                                         [-e event]
                                                         [-l pvalue]
                                                         [-b blocksize]
                                                         [--log2] [-o outdir]
                                                         [-p prefix]
batch unvariate cox regression gene expression
optional arguments:
  -h, --help            show this help message and exit
  -m metadata, --metadata metadata
                        input metadata file path with suvival time [required]
  -g expset, --expset expset
                        input gene expression set file [required]
  -t time, --time time  set suvival time column name in metadata [default
                        TIME]
  -e event, --event event
                        set event column name in metadata [default EVENT]
  -l pvalue, --pvalue pvalue
                        pvalue cutoff to choose sig gene [default 0.01]
  -b blocksize, --blocksize blocksize
                        Number of variables Parallel to test in each [default
                        2]
  --log2                whether do log2 transfrom for expression data
                        [optional, default: False]
  -o outdir, --outdir outdir
                        output file directory [default cwd]
  -p prefix, --prefix prefix
                        out file name prefix [default cox]


Parameters:

-m: input survival data

EVENT column: 0 means alive, 1 means dead

barcode TIME EVENT
TCGA-B7-A5TK-01A-12R-A36D-31 288 0
TCGA-BR-7959-01A-11R-2343-13 1010 0
TCGA-IN-8462-01A-11R-2343-13 572 0
TCGA-CG-4443-01A-01R-1157-13 912 0
TCGA-KB-A93J-01A-11R-A39E-31 1124 0
TCGA-HU-A4H3-01A-21R-A251-31 882 0
TCGA-RD-A8MV-01A-11R-A36D-31 3720 0
TCGA-VQ-A91X-01A-12R-A414-31 289 1
TCGA-D7-8575-01A-11R-2343-13 554 1
TCGA-BR-8485-01A-11R-2402-13 280 0
TCGA-D7-A748-01A-12R-A32D-31 132 1
TCGA-VQ-A91Z-01A-11R-A414-31 1690 0
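
Before running the script it can help to sanity-check the survival table. The following is a minimal Python sketch, not part of univariate_cox_batch.r. It assumes the file is tab-separated with a header row (suggested by the .tsv extension in the example command, but not stated), and the helper name read_survival is hypothetical. The three rows are copied from the table above.

```python
import csv
import io

# Three rows copied from the metadata table above.
# Assumption: the real file is tab-separated with a header line.
METADATA_TEXT = (
    "barcode\tTIME\tEVENT\n"
    "TCGA-B7-A5TK-01A-12R-A36D-31\t288\t0\n"
    "TCGA-VQ-A91X-01A-12R-A414-31\t289\t1\n"
    "TCGA-D7-8575-01A-11R-2343-13\t554\t1\n"
)


def read_survival(handle, time_col="TIME", event_col="EVENT"):
    """Return {barcode: (time, event)} and reject malformed rows.

    time_col and event_col mirror the script's -t and -e defaults.
    """
    table = {}
    for rec in csv.DictReader(handle, delimiter="\t"):
        time = float(rec[time_col])
        event = int(rec[event_col])
        if time < 0 or event not in (0, 1):
            raise ValueError(f"bad survival row for {rec['barcode']}")
        table[rec["barcode"]] = (time, event)
    return table


survival = read_survival(io.StringIO(METADATA_TEXT))
n_events = sum(event for _, event in survival.values())
print(f"{len(survival)} samples, {n_events} deaths")
```

With a real file, io.StringIO(METADATA_TEXT) would be replaced by an open file handle.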


-g: input gene expression file (genes in rows, samples in columns)

ID TCGA-B7-A5TK-01A-12R-A36D-31 TCGA-BR-7959-01A-11R-2343-13 TCGA-IN-8462-01A-11R-2343-13 TCGA-BR-A4CR-01A-11R-A24K-31 TCGA-CG-4443-01A-01R-1157-13 TCGA-KB-A93J-01A-11R-A39E-31 TCGA-BR-4371-01A-01R-1157-13 TCGA-IN-A6RO-01A-12R-A33Y-31 TCGA-HU-A4H3-01A-21R-A251-31
FGR 16.34408 11.96739 5.350846 2.209351 1.53802 15.24016 4.501118 2.602437 6.261761
CD38 86.86772 15.79451 3.111342 1.240707 0.862955 13.3047 3.728708 1.673952 2.675173
ITGAL 40.26903 7.358566 3.769125 2.387869 2.37351 38.08591 8.305283 3.622781 7.025886
CX3CL1 603.0132 26.91353 20.22238 4.195262 19.04097 14.15295 13.75885 6.675374 4.050271
CEACAM21 1.868536 2.571917 0.610839 0.674558 1.092127 3.483559 1.134309 4.471274 0.584159
MATK 2.28342 0.864116 0.519776 2.442093 0.760348 3.192951 1.161881 0.347882 1.039336
CD79B 3.453198 1.879957 2.822192 0.523587 1.926592 3.651742 0.831288 0.883643 1.979214
MMP25 13.72829 3.451148 1.106563 1.131217 0.878735 10.43186 1.475852 1.914284 2.312993
TRAF3IP3 5.24401 1.880186 0.875264 0.756153 0.603251 3.325013 2.347473 0.570462 1.315916
CD4 77.74691 51.83719 22.77076 11.07811 35.20445 122.5578 31.10107 15.06619 15.41347
BTK 6.856235 4.362261 1.482688 1.371599 1.981236 6.91154 3.187848 0.955499 1.48269
FMO1 7.168567 7.711817 3.223174 0.979034 0.450307 1.093412 1.001808 0.910204 1.558515
SYT7 1.153105 81.94068 2.673384 191.6112 82.49394 0.510373 4.470482 1.28506 0.91944
TYROBP 591.7796 338.0271 184.8133 69.18483 150.6397 480.5691 121.096 72.4588 116.9793
CD22 0.819295 2.521607 1.588505 0.41259 0.387288 1.123633 0.488244 0.258094 0.713988
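
The expression columns and the metadata rows have to refer to the same samples. In the excerpts shown here, for example, TCGA-BR-A4CR-01A-11R-A24K-31 appears in the expression header but not among the metadata rows listed. How the script handles such samples is not documented, so the Python sketch below only illustrates one plausible approach: keep the shared samples and optionally log2-transform the values. The helper names and the pseudocount of 1 are assumptions, not the behavior of the --log2 flag.

```python
import math

# First four expression columns from the header above.
expr_samples = [
    "TCGA-B7-A5TK-01A-12R-A36D-31",
    "TCGA-BR-7959-01A-11R-2343-13",
    "TCGA-IN-8462-01A-11R-2343-13",
    "TCGA-BR-A4CR-01A-11R-A24K-31",
]
# First three barcodes from the metadata table.
meta_samples = [
    "TCGA-B7-A5TK-01A-12R-A36D-31",
    "TCGA-BR-7959-01A-11R-2343-13",
    "TCGA-IN-8462-01A-11R-2343-13",
]


def shared_samples(expr_ids, meta_ids):
    """Keep expression columns that also have survival data, in column order."""
    known = set(meta_ids)
    return [s for s in expr_ids if s in known]


def log2_transform(values, pseudocount=1.0):
    """Compute log2(x + pseudocount); the pseudocount is an assumption."""
    return [math.log2(v + pseudocount) for v in values]


keep = shared_samples(expr_samples, meta_samples)
# FGR values for the four columns above, keyed by sample.
fgr = dict(zip(expr_samples, [16.34408, 11.96739, 5.350846, 2.209351]))
fgr_log2 = [round(v, 3) for v in log2_transform([fgr[s] for s in keep])]
print(keep)
print(fgr_log2)
```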


Example usage:

Rscript $scriptdir/univariate_cox_batch.r -m metadata_survival_time.tsv \
   -g deg_gene_exp_tpm.tsv  -e EVENT -t TIME -p  imm.unicox  --pvalue 0.01
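
To make concrete what one per-gene fit involves, here is a small pure-Python sketch of a univariate Cox model, h(t | x) = h0(t) * exp(beta * x), fitted by Newton-Raphson on the partial likelihood. It is an illustration only, not the script's code: the script presumably relies on the R survival package listed in the references, whose coxph uses the Efron method for tied event times by default, whereas this sketch uses the simpler Breslow approximation. The function name and the toy data are hypothetical.

```python
import math


def fit_univariate_cox(times, events, x, max_iter=50, tol=1e-10):
    """Fit a one-covariate Cox model; return (beta, standard_error).

    Ties are handled with the Breslow approximation. Raises
    ZeroDivisionError if there are no events or x never varies
    within a risk set.
    """
    n = len(times)
    # Visit subjects from latest to earliest time so the risk set only grows.
    order = sorted(range(n), key=lambda i: times[i], reverse=True)
    beta = 0.0
    info = 0.0
    for _ in range(max_iter):
        s0 = s1 = s2 = 0.0  # risk-set sums of w, w*x, w*x^2 with w = exp(beta*x)
        score = 0.0         # first derivative of the log partial likelihood
        info = 0.0          # negative second derivative
        i = 0
        while i < n:
            t = times[order[i]]
            deaths = 0
            x_deaths = 0.0
            # Add every subject observed at time t to the risk set.
            while i < n and times[order[i]] == t:
                k = order[i]
                w = math.exp(beta * x[k])
                s0 += w
                s1 += w * x[k]
                s2 += w * x[k] * x[k]
                if events[k]:
                    deaths += 1
                    x_deaths += x[k]
                i += 1
            if deaths:
                mean = s1 / s0
                score += x_deaths - deaths * mean
                info += deaths * (s2 / s0 - mean * mean)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta, 1.0 / math.sqrt(info)


# Hypothetical toy data: six samples, two of them censored (event = 0).
times = [1, 2, 3, 4, 5, 6]
events = [1, 1, 1, 0, 1, 0]
expr = [2.0, 0.5, 3.0, 1.0, 1.5, 0.2]

beta, se = fit_univariate_cox(times, events, expr)
print(f"beta={beta:.4f} se={se:.4f} HR={math.exp(beta):.4f}")
```

A batch run repeats this kind of fit once per gene, each time with that gene's expression values as x.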


Batch Cox analysis results:

Variable Term Beta StandardError Z P LRT Wald LogRank HR HRlower HRupper
SYT12 SYT12 0.091121 0.019495 4.674035 2.95E-06 0.000128 2.95E-06 1.80E-06 1.095402 1.054336 1.138067
CDH2 CDH2 0.013266 0.003014 4.401803 1.07E-05 0.001993 1.07E-05 2.11E-07 1.013354 1.007386 1.019357
GPNMB GPNMB 0.002759 0.000768 3.590118 0.000331 0.001313 0.000331 0.000409 1.002762 1.001253 1.004274
TMIGD3 TMIGD3 0.06788 0.019314 3.514468 0.000441 0.001248 0.000441 0.000419 1.070237 1.03048 1.111528
LINC01094 LINC01094 0.132441 0.040375 3.280293 0.001037 0.002242 0.001037 0.001026 1.141611 1.054754 1.235621
SLC22A20P SLC22A20P 0.049415 0.01583 3.12165 0.001798 0.012065 0.001798 0.000736 1.050656 1.018559 1.083765
IGHV4-61 IGHV4-61 0.001791 0.000582 3.077573 0.002087 0.00859 0.002087 0.001742 1.001793 1.000651 1.002937
IGHV2-5 IGHV2-5 0.002236 0.000737 3.034961 0.002406 0.007782 0.002406 0.002205 1.002238 1.000792 1.003686
SERPINA5 SERPINA5 0.007681 0.002558 3.002428 0.002678 0.009799 0.002678 0.002064 1.007711 1.00267 1.012776
MS4A4A MS4A4A 0.01446 0.00505 2.863218 0.004194 0.008427 0.004194 0.004722 1.014565 1.004572 1.024657
FAM83A FAM83A 0.006237 0.002368 2.633445 0.008452 0.028523 0.008452 0.007295 1.006256 1.001596 1.010938
IGLV3-9 IGLV3-9 0.000547 0.00021 2.608281 0.0091 0.039427 0.0091 0.010244 1.000547 1.000136 1.000958
STARD3 STARD3 0.000928 0.000358 2.588913 0.009628 0.030258 0.009628 0.006743 1.000928 1.000225 1.001631
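
Several result columns can be derived from Beta and StandardError. The SYT12 row is consistent with Z = Beta / StandardError, P being the two-sided Wald p-value, HR = exp(Beta), and HRlower/HRupper being a 95% confidence interval exp(Beta ± 1.96 × StandardError). The Python sketch below recomputes these values as a check. It is inferred from the numbers in the table, not taken from the script's source, and wald_summary is a hypothetical helper name.

```python
import math


def wald_summary(beta, se, z_crit=1.959964):
    """Derive Z, two-sided Wald P, HR and a 95% CI from beta and its SE."""
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return {
        "Z": z,
        "P": p,
        "HR": math.exp(beta),
        "HRlower": math.exp(beta - z_crit * se),
        "HRupper": math.exp(beta + z_crit * se),
    }


# Beta and StandardError from the SYT12 row above.
syt12 = wald_summary(beta=0.091121, se=0.019495)

# Cutoff value used with --pvalue in the example command.
P_CUTOFF = 0.01
is_significant = syt12["P"] < P_CUTOFF

for name, value in syt12.items():
    print(f"{name}: {value:.6g}")
print("significant:", is_significant)
```

In the table every row has P below 0.01, while some LRT and LogRank values exceed it (for example SLC22A20P and IGLV3-9), which suggests the cutoff is applied to the P column.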


References:

Therneau T (2021). A Package for Survival Analysis in R. R package version 3.2-11, https://CRAN.R-project.org/package=survival.

Terry M. Therneau, Patricia M. Grambsch (2000). Modeling Survival Data: Extending the Cox Model. Springer, New York. ISBN 0-387-98784-3.


  • Published 2021-06-25 16:47
Author: omicsgene
