小鼠单细胞转录组方法参考 method

小鼠单细胞转录组方法

Mouse breeding

Tissue processing and library construction for scRNA-seq（咨询测序公司）

scRNA-seq data pre-processing

Preprocessing of raw data was initially performed using Cell Ranger software pipeline (version 9.0.1)(Zheng et al. 2017). This step included reads alignment on human genome assembly GRCm39 using STAR(Dobin et al. 2013), counting of unique molecular identifier (UMI) and generating a digital gene-cell matrix. The gene expression matrix was then processed and analyzed by Seurat (version 5.3.1) package(Hao et al. 2024).

For each single-cell RNA-seq data, we created a Seurat object with the CreateSeuratObject function. All Seurat objects were merged to one and filtering steps as follows were applied for the merged data: (1) cells with fewer than 500 detected genes, more than 8000 detected genes, or with UMI counts exceeding 25,000 were excluded. (2) cells with percentage of mitochondrial UMI count larger than 15% were filtered out. (3) the genes detected in more than 3 cells were retained and all ribosomal genes were filtered out. （4）Doublets and multiplets were removed using DoubletFinder (v2.0.3)(McGinnis, Murrow, and Gartner 2019)

After quality control and filtering, each sample was processed separately for normalization and identification of highly variable genes. Gene expression matrices were normalized using Seurat’s NormalizeData function by dividing the UMI counts of each gene by the total UMIs per cell, multiplying by 10,000, and applying a log-transformation (ln(UMI-per-10,000 + 1)). Subsequently, 2,000 highly variable genes were identified in each dataset and used for principal component analysis (PCA). To correct for batch effects across samples, data integration was performed using the Harmony algorithm (Korsunsky et al., 2019) with the RunHarmony function from the Harmony package (v1.2.4) based on the first 40 principal components. The harmonized data were then used to construct a shared nearest-neighbor graph with FindNeighbors (dims = 1:40) and clustered using FindClusters (resolution = 0.8). Finally, nonlinear dimensionality reduction and visualization were conducted using RunUMAP (dims = 1:40, umap.method = “umap-learn”).

cell type annotation

After unsupervised clustering, all major dental-related cell types were identified based on well-established marker genes, i.e., Runx2, Postn, Mfap4 for dental follicle cells; Krt5, Dsc3, Pitx2 for dental epithelium cells; Dlx6os1, Plppr1, Rspo4 for dental papilla cells; Enam, Amelx, Vwde for ameloblasts; Cdh5, Pecam1, Emcn for endothelial cells; Csf1r, Cd68, Fcer1g for immune cells; Rgs5, Pdgfrb, Notch3 for perivascular cells; Plp1, Sox10, Mbp for Schwann cells; and Bglap, Phex, Ifitm5 for odontoblasts. Cluster-specific marker genes were identified by using the FindAllMarkers function with the parameter setting ‘logfc.threshold = 0.25,min.pct=0.1’.

After further rounds of unsupervised subclustering of the major cell clusters, these cell populations were subclustered into functional phenotypes using the same approach as previous described. The subclusters were annotated by selecting a specific signature gene while considering both the observed expression pattern and established conventions.

Signature score calculation

irGSEA(Fan et al. 2024) is an R package designed to assess the outcomes of various gene set scoring methods and facilitate comprehensive visualization of results. We calculated geneSet enrichment scores of cells using the irGSEA.score function and visualized via the irGSEA.density.scatterplot function

Pathway analysis

With the aim of summarizing the process of biological-term classification and the enrichment analysis of pathways, we applied ClusterProfiler(Wu et al. 2021) for GO enrichment of a given gene set. Gene Ontology (GO) annotates genes to biological processes, molecular functions, and cellular components in a directed acyclic graph structure.

DecoupleR (v2.1.4)(Badia-I-Mompel et al. 2022) is an R package containing a collection of methods adapted for biological activity estimation in single-cell data. PROGENy(Holland, Szalai, and Saez-Rodriguez 2020) is a comprehensive resource containing a curated collection of 14 pathways and their target genes, with weights for each interaction. To calculate pathway enrichment scores in different malignant ductal cell subtypes, we run the multivariate linear model (mlm) method using the function run_mlm with default parameters. For each cell in our dataset, it fitted a linear model that predicted the observed gene expression based on pathway-gene interaction weights. Once fitted, the obtained t-values of the slopes were the scores. If it is positive, we interpreted that the pathway is active and if it is negative we interpreted that it is inactive. Then, we calculated the mean scores of cells in each cell subtypes as their pathway enrichment scores

Pseudotime trajectory analysis

Monocle2 (v2.38.0) (Trapnell et al. 2014)was used to infer the state transition of the cell types and subtypes. The unique molecular identifier count matrix of the cells was used to create the CellDataSet object and then filter out the genes expressed in fewer than ten cells. Genes with q < 0.01 were identified as DEGs using the differentialGeneTest function and sorted according to q value using the setOrderingFilter function. The pseudotime trajectory was constructed using the DDRTree algorithm using default parameters. The dynamic expression changes of selected marker genes in pseudotime were visualized using the plot_genes_in_pseudotime and plot_pseudotime_heatmap functions.

Pseudotime analysis was conducted using the R package Monocle3 (v1.4.26)(Cao et al. 2019). The integrated Seurat object, along with the associated metadata, was imported into Monocle3. The new_cell_data_set function was utilized to convert the Seurat object into a Monocle3 object. A total of 50 principal components were computed using the preprocess_cds function, with cells from different groups aligned using the parameters (alignment_group = ‘SampleID’). The UMAP was generated using the reduce_dimension function with parameters (umap.min_dist = 0.1, reduction_method = ‘UMAP’, umap. n_neighbours = 15). Each cell was assigned a pseudotime value using the order_cells function, with the root state determined on the basis of the predicted potency scores calculated by CytoTRACE2 (v1.1.0)(Kang et al. 2025).

RNA velocity and estimation of the fate of cells

All the scRNA-seq data were used for the RNA velocity analysis. Firstly, the spliced and unspliced UMIs were recounted by the python package velocyto(La Manno et al. 2018). Then we normalized the count matrices by the library size and kept the genes detected in over 20 cells in both spliced and unspliced matrices, selected HVGs, log-transformed the normalized spliced and unspliced counts by the scvelo.pp.filter_and_normalize function from scvelo package(v0.2.5)(Bergen et al. 2020). Duplicate cells were moved by the scvelo.pp.remove_duplicate_cells function and moments for velocity estimation was computed by scvelo.pp.moment function with the parameter setting ‘n_pcs=30, n_neighbors=30’. The RNA velocity was estimated using the scvelo.tl.velocity function and the velocity graph was built with the cosine similarity using the scvelo.tl.velocity_graph function. The RNA velocities were projected into the UMAP coordinates with the scvelo.pl.velocity_embedding_stream function for visualization.

To further infer the fate of dental_epithelium cells, the subpopulations were subjected to the partition-based graph abstraction method PAGA using the function scv.tl.paga to further assess the most likely trajectories of cell progression. Different to what was described in the previous part, we run the dynamical model to obtain the full transcriptional dynamics of splicing kinetics. Finally, we estimated fate combining these two methods.

Cellular interaction analysis by CellChat

We used Cellchat(v2.2.0)(Jin, Plikus, and Nie 2025) to infer cell-cell interaction between each two cell subclusters. Initially, a CellChat object was established by grouping defined clusters. The “CellChatDB.mouse” ligand‒receptor interaction database was used for analysis without additional supplementation, and all preprocessing steps were performed using default parameters. The functions computeCommunProb and computeCommunProbPathway were employed to infer the network of each ligand‒receptor pair and signaling pathway individually. Visualization was achieved through a hierarchy plot, circle plot, and heatmap, each serving as a distinct form of visualization.

Correlation Analysis of snRNA-Seq and Bulk RNA-Seq

Our experimental framework comprised two wheat varieties (Chinese Spring [CS] and Dekang 961 [DK]) subjected to three salt stress durations (0 h, 3 h, 8 h), with two biological replicates per condition, generating 12 snRNA-seq datasets. For the bulk RNA-seq data, the expression values of the two biological replicates for each sample were averaged, producing an average expression matrix for six treatment groups: “CS_0h”, “CS_3h”, “CS_8h”, “DK_0h”, “DK_3h”, and “DK_8h”. This matrix was then normalized so that the sum of expression values for each sample equaled 10 000. For the snRNA-seq data, the “AverageExpression” function from the Seurat R package was used to calculate the average expression matrix for the six treatment groups. The Pearson correlation coefficient (Kirch 2008) between the bulk and snRNA-seq data for the six treatment groups was calculated using the “cor.test” function in R. The correlation results were visualized using the ggplot2 package (Ginestet 2011). Specifically, we created scatter plots with points coloured by treatment group and added trend lines to illustrate the correlation. The Pearson correlation coefficients and corresponding p-values for each treatment group are provided in Figure S1.

Reference

Badia-I-Mompel, P., Vélez Santiago, J., Braunger, J., Geiss, C., Dimitrov, D., Müller-Dott, S., Taus, P., Dugourd, A., Holland, C. H., Ramirez Flores, R. O., & Saez-Rodriguez, J. (2022). decoupleR: ensemble of computational methods to infer biological activities from omics data. Bioinformatics Advances, 2(1). https://doi.org/10.1093/BIOADV/VBAC016

Bergen, V., Lange, M., Peidli, S., Wolf, F. A., & Theis, F. J. (2020). Generalizing RNA velocity to transient cell states through dynamical modeling. Nature Biotechnology, 38(12), 1408–1414. https://doi.org/10.1038/s41587-020-0591-3

Cao, J., Spielmann, M., Qiu, X., Huang, X., Ibrahim, D. M., Hill, A. J., Zhang, F., Mundlos, S., Christiansen, L., Steemers, F. J., Trapnell, C., & Shendure, J. (2019). The single-cell transcriptional landscape of mammalian organogenesis. Nature 2019 566:7745, 566(7745), 496–502. https://doi.org/10.1038/s41586-019-0969-x

Dobin, A., Davis, C. A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson, M., & Gingeras, T. R. (2013). STAR: ultrafast universal RNA-seq aligner. Bioinformatics, 29(1), 15–21. https://doi.org/10.1093/BIOINFORMATICS/BTS635

Fan, C., Chen, F., Chen, Y., Huang, L., Wang, M., Liu, Y., Wang, Y., Guo, H., Zheng, N., Liu, Y., Wang, H., & Ma, L. (2024). irGSEA: the integration of single-cell rank-based gene set enrichment analysis. Briefings in Bioinformatics, 25(4). https://doi.org/10.1093/BIB/BBAE243

Hao, Y., Stuart, T., Kowalski, M. H., Choudhary, S., Hoffman, P., Hartman, A., Srivastava, A., Molla, G., Madad, S., Fernandez-Granda, C., & Satija, R. (2024). Dictionary learning for integrative, multimodal and scalable single-cell analysis. Nature Biotechnology, 42(2), 293–304. https://doi.org/10.1038/s41587-023-01767-y

Holland, C. H., Szalai, B., & Saez-Rodriguez, J. (2020). Transfer of regulatory knowledge from human to mouse for functional genomics analysis. Biochimica et Biophysica Acta (BBA) - Gene Regulatory Mechanisms, 1863(6), 194431. https://doi.org/10.1016/J.BBAGRM.2019.194431

Jin, S., Plikus, M. V., & Nie, Q. (2025). CellChat for systematic analysis of cell–cell communication from single-cell transcriptomics. Nature Protocols, 20(1), 180–219. https://doi.org/10.1038/S41596-024-01045-4;SUBJMETA

Kang, M., Gulati, G. S., Brown, E. L., Qi, Z., Avagyan, S., Armenteros, J. J. A., Gleyzer, R., Zhang, W., Steen, C. B., D’Silva, J. P., Schwab, J., Clarke, M. F., Chaudhuri, A. A., & Newman, A. M. (2025). Improved reconstruction of single-cell developmental potential with CytoTRACE 2. Nature Methods 2025 22:11, 22(11), 2258–2263. https://doi.org/10.1038/s41592-025-02857-2

La Manno, G., Soldatov, R., Zeisel, A., Braun, E., Hochgerner, H., Petukhov, V., Lidschreiber, K., Kastriti, M. E., Lönnerberg, P., Furlan, A., Fan, J., Borm, L. E., Liu, Z., van Bruggen, D., Guo, J., He, X., Barker, R., Sundström, E., Castelo-Branco, G., … Kharchenko, P. V. (2018). RNA velocity of single cells. Nature, 560(7719), 494–498. https://doi.org/10.1038/s41586-018-0414-6

McGinnis, C. S., Murrow, L. M., & Gartner, Z. J. (2019). DoubletFinder: Doublet Detection in Single-Cell RNA Sequencing Data Using Artificial Nearest Neighbors. Cell Systems, 8(4), 329-337.e4. https://doi.org/10.1016/j.cels.2019.03.003

Trapnell, C., Cacchiarelli, D., Grimsby, J., Pokharel, P., Li, S., Morse, M., Lennon, N. J., Livak, K. J., Mikkelsen, T. S., & Rinn, J. L. (2014). The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nature Biotechnology 2014 32:4, 32(4), 381–386. https://doi.org/10.1038/nbt.2859

Wu, T., Hu, E., Xu, S., Chen, M., Guo, P., Dai, Z., Feng, T., Zhou, L., Tang, W., Zhan, L., Fu, X., Liu, S., Bo, X., & Yu, G. (2021). clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. The Innovation, 2(3). https://doi.org/10.1016/j.xinn.2021.100141

Zheng, G. X. Y., Terry, J. M., Belgrader, P., Ryvkin, P., Bent, Z. W., Wilson, R., Ziraldo, S. B., Wheeler, T. D., McDermott, G. P., Zhu, J., Gregory, M. T., Shuga, J., Montesclaros, L., Underwood, J. G., Masquelier, D. A., Nishimura, S. Y., Schnall-Levin, M., Wyatt, P. W., Hindson, C. M., … Bielas, J. H. (2017). Massively parallel digital transcriptional profiling of single cells. Nature Communications, 8(1), 14049. https://doi.org/10.1038/ncomms14049

发表于 2026-01-23 13:09
阅读 ( 263 )
分类：转录组

小鼠单细胞转录组方法参考 method

Correlation Analysis of snRNA-Seq and Bulk RNA-Seq

Reference

你可能感兴趣的文章

相关问题

0 条评论

作家榜 »