Plant single-cell analysis methods

Cell calling

To identify high-quality nuclei (a term used interchangeably with “barcodes”) from the filtered set of alignments, we implemented heuristic cutoffs on genomic context and sequencing depth. Specifically, we fit a smoothing spline to the log10-transformed number of unique Tn5 integration sites per nucleus (response) against the log10 barcode rank (barcodes ordered by decreasing per-nucleus unique Tn5 integration site count) using the smooth.spline function (spar = 0.01) from base R (R Core Team, 2013). We then used the fitted values from the spline model to estimate the first derivative (slope), taking the local minimum within the first 16,000 barcodes as a potential knee/inflection point (16,000 was selected to match the maximum number of input nuclei). We set the unique Tn5 library depth threshold to the lesser of 1,000 reads and the knee/inflection point, excluding all barcodes below the threshold.

Spurious integration patterns throughout the genome can reflect incomplete Tn5 integration, fragmented or low-quality nuclei, or poor sequence recovery, among other sources of technical noise. In contrast, high-quality nuclei often show a strong aggregate accessibility signal near TSSs. We therefore implemented two approaches for estimating signal-to-noise ratios in our scATAC-seq data. First, on a per-library basis, we removed nuclei whose fraction of reads mapping within 2 kb of TSSs was more than two standard deviations below the library mean. Then, we estimated TSS enrichment scores by calculating the average per-bp coverage of 2-kb windows surrounding TSSs, scaling by the average per-bp coverage of the first and last 100 bp of the window (background estimate; average of positions 1-100 and 1901-2000), and smoothing the scaled signal with rolling means (R package zoo). Per-barcode TSS enrichment scores were taken as the maximum signal within 250 bp of the TSS.
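The TSS enrichment score described above can be sketched in a few lines. This is a minimal illustration in Python rather than the R code used in the paper. It assumes the input is one barcode's per-bp coverage already averaged across all TSS windows (2,000 values with the TSS at the center). The function names, the rolling-mean width of 11 bp, and the truncated windows at the edges are assumptions for illustration; the text does not state the smoothing width.

```python
from statistics import mean


def rolling_mean(values, width):
    """Centered rolling mean; windows are truncated at the edges (a simplification)."""
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(mean(window))
    return smoothed


def tss_enrichment(coverage, flank=100, core=250, smooth_width=11):
    """Return a TSS enrichment score for one barcode.

    coverage: list of per-bp coverage values averaged over all TSS windows,
              ordered along the 2-kb window with the TSS at the center.
    flank:    number of bp at each end used as the background estimate.
    core:     half-width (bp) around the TSS in which the maximum is taken.
    smooth_width: rolling-mean width (hypothetical value, not from the text).
    """
    # Background: mean coverage of the first and last `flank` positions.
    background = mean(coverage[:flank] + coverage[-flank:])
    if background == 0:
        return 0.0  # assumption: no background signal means no usable score
    # Scale the whole window by the background, then smooth.
    scaled = [c / background for c in coverage]
    smoothed = rolling_mean(scaled, smooth_width)
    # Score: maximum smoothed signal within `core` bp of the TSS.
    center = len(coverage) // 2
    return max(smoothed[center - core: center + core + 1])


# Toy profile: flat background of 1 with a 5-fold peak around the TSS.
profile = [1.0] * 2000
for pos in range(950, 1051):
    profile[pos] = 5.0
score = tss_enrichment(profile)
```

A barcode with no enrichment gives a score near 1, since the window is scaled by its own flanks, so higher scores indicate stronger aggregate accessibility near TSSs.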
Lastly, for each library, we removed any barcode whose proportion of reads mapping to the chloroplast and mitochondrial genomes was more than two standard deviations above the library mean.
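Both per-library outlier filters (low TSS read fraction, high organelle read fraction) follow the same mean and standard deviation pattern. The sketch below is one possible reading, in Python. The dictionary layout, the field names `tss_frac` and `organelle_frac`, and the use of the sample standard deviation are assumptions, and the two cutoffs are computed on the same input set, whereas the text applies the filters in sequence.

```python
from statistics import mean, stdev


def sd_bounds(values, n_sd=2.0):
    """Return (lower, upper) bounds at n_sd sample standard deviations from the mean."""
    mu = mean(values)
    sd = stdev(values)  # assumption: sample SD, as R's sd() would compute
    return mu - n_sd * sd, mu + n_sd * sd


def filter_library(barcodes, n_sd=2.0):
    """Filter the barcodes of a single library.

    barcodes: dict mapping barcode -> {"tss_frac": float, "organelle_frac": float}
              (hypothetical layout for this example).
    Keeps barcodes whose TSS read fraction is not below the lower bound and
    whose organelle read fraction is not above the upper bound.
    """
    tss_low, _ = sd_bounds([b["tss_frac"] for b in barcodes.values()], n_sd)
    _, org_high = sd_bounds([b["organelle_frac"] for b in barcodes.values()], n_sd)
    return {
        name: b
        for name, b in barcodes.items()
        if b["tss_frac"] >= tss_low and b["organelle_frac"] <= org_high
    }


# Toy library: 20 barcodes, one with a low TSS fraction and one rich in organelle reads.
library = {f"bc{i}": {"tss_frac": 0.5, "organelle_frac": 0.05} for i in range(20)}
library["bc0"]["tss_frac"] = 0.1
library["bc1"]["organelle_frac"] = 0.6
kept = filter_library(library)
```

Because the cutoffs are derived from each library's own distribution, the function should be called once per library rather than on pooled barcodes.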

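Returning to the depth threshold described at the start of this section, the knee search can be approximated as follows. This Python sketch is not the paper's implementation: it swaps R's smooth.spline for a simple moving average in log-log space, estimates the slope by forward differences, and takes the count at the steepest point as the knee depth. The function names and the smoothing width are assumptions.

```python
import math


def moving_average(values, width):
    """Centered moving average with truncated windows at the edges."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def depth_threshold(site_counts, max_nuclei=16000, depth_cap=1000, width=5):
    """Return the unique-integration-site threshold for one library.

    site_counts: iterable of unique Tn5 integration site counts, one per barcode.
    max_nuclei:  only the first `max_nuclei` ranked barcodes are searched.
    depth_cap:   the threshold is the lesser of this value and the knee depth.
    width:       moving-average width (stand-in for the spline; hypothetical).
    """
    # Rank barcodes by decreasing depth; zero counts cannot be log-transformed.
    counts = sorted((c for c in site_counts if c > 0), reverse=True)
    if len(counts) < 2:
        raise ValueError("need at least two barcodes with nonzero counts")
    log_rank = [math.log10(r) for r in range(1, len(counts) + 1)]
    log_depth = moving_average([math.log10(c) for c in counts], width)
    # First derivative of log depth with respect to log rank.
    limit = min(max_nuclei, len(counts))
    slopes = [
        (log_depth[i + 1] - log_depth[i]) / (log_rank[i + 1] - log_rank[i])
        for i in range(limit - 1)
    ]
    # Knee: rank with the most negative slope within the searched range.
    knee_index = min(range(len(slopes)), key=slopes.__getitem__)
    return min(depth_cap, counts[knee_index])


# Toy library: 100 deeply sequenced nuclei and 900 low-depth barcodes.
toy_counts = [50000] * 100 + [3000] * 900
threshold = depth_threshold(toy_counts)
passing = [c for c in toy_counts if c >= threshold]
```

On real data the rank curve falls off gradually, so the count at the knee usually lies between the nuclei plateau and the background; the toy input here is a sharp step chosen only to keep the example small.
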
https://www.cell.com/cell/fulltext/S0092-8674(21)00493-1