Single-cell multi-omics integration

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Shortytot (talk | contribs) at 00:05, 23 February 2024 (added citations). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Single-cell multi-omics integration describes a suite of computational methods used to harmonize information from multiple "omes" to jointly analyze biological phenomena[1][2][3][4]. This approach allows researchers to discover intricate relationships between different chemical-physical modalities by drawing associations across various molecular layers simultaneously. Multi-omics integration approaches fall into four broad categories: early integration, intermediate integration, late integration, and mixed integration[5]. Multi-omics integration can enhance experimental robustness by providing independent sources of evidence to address hypotheses, leveraging the strengths of one modality to compensate for the weaknesses of another through imputation, and offering cell-type clustering and visualizations that more closely reflect the underlying biology[1][2].

Background

The emergence of single-cell sequencing technologies has revolutionized our understanding of cellular heterogeneity, uncovering a nuanced landscape of cell types and their associations with biological processes. Single-cell omics technologies have extended beyond the transcriptome to profile diverse physical-chemical properties at single-cell resolution, including whole genomes/exomes, DNA methylation, chromatin accessibility, histone modifications, epitranscriptome (e.g., mRNAs, microRNAs, tRNAs, lncRNAs), proteome, phosphoproteome, metabolome, and more[3][6][7]. In fact, there is an expanding repository of publicly available single-cell datasets, exemplified by growing databases such as the Human Cell Atlas Project (HCA), The Cancer Genome Atlas (TCGA), and the ENCODE project[8][9][10][11][12]. With the increasing diversity in both available datasets and data types, multi-omics data integration and multimodal data analysis represent pivotal trajectories for the future of systems biology.

Single-cell multi-omics integration can reveal underappreciated relationships between chemical-physical modalities, broaden our definition of cell states beyond single-modality feature profiles, and provide independent evidence during analysis to support testing of biological hypotheses. However, the high dimensionality (features > observations), high degree of stochastic technical and biological variability, and sparsity of single-cell data (low molecule recovery efficiency) make computational integration a challenging problem[13][14][15][16]. Furthermore, the appropriate solution depends on factors such as whether the data are matched (simultaneous measurements derived from the same cell) or unmatched (measurements derived from different cells), whether cell-type annotations are available, and whether features can be converted between modalities[1]. As such, there are multiple approaches to single-cell data integration, each with a distinct use case and its own set of advantages and disadvantages[1][5][17].

Methodology

Early Integration

Early integration involves concatenating two or more omic datasets (e.g., scRNA-seq data and scATAC-seq data) into a single merged data matrix[18][19]. This approach is simple and can capture dependencies between features, but the concatenated features differ in dimension and scale across modalities. More importantly, the merged matrix has even higher dimensionality than its inputs, so feature selection and dimensionality reduction are often necessary. Because of these challenges, early integration has most commonly been used to concatenate different datasets of the same data type (e.g., integrating two different scRNA-seq datasets).
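The concatenate-then-reduce idea can be illustrated with a minimal NumPy sketch. It assumes matched cells (the same rows in every matrix); the function name `early_integrate`, the per-block weighting, and the toy data are illustrative assumptions rather than the procedure of any cited tool.

```python
import numpy as np

def early_integrate(blocks, n_components=2):
    """Concatenate per-modality matrices (cells x features), then reduce with PCA."""
    scaled = []
    for X in blocks:
        X = np.asarray(X, dtype=float)
        # Standardize every feature so modalities on different scales are comparable.
        mu = X.mean(axis=0)
        sd = X.std(axis=0)
        sd[sd == 0] = 1.0  # leave constant features at zero instead of dividing by 0
        Z = (X - mu) / sd
        # Assumed weighting: divide by sqrt(#features) so each modality
        # contributes comparable total variance to the merged matrix.
        scaled.append(Z / np.sqrt(Z.shape[1]))
    merged = np.hstack(scaled)  # cells x (sum of feature counts)
    # PCA via SVD; the merged matrix is already column-centered.
    U, S, Vt = np.linalg.svd(merged, full_matrices=False)
    return merged, U[:, :n_components] * S[:n_components]

# Toy data standing in for expression counts and binary accessibility.
rng = np.random.default_rng(0)
rna = rng.poisson(5.0, size=(50, 200))
atac = rng.binomial(1, 0.1, size=(50, 1000))
merged, embedding = early_integrate([rna, atac], n_components=10)
```

Without some form of block weighting, the modality with more features (here the 1,000 accessibility peaks) would tend to dominate the leading components, which is one concrete form of the scale problem described above.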

Intermediate Integration

Intermediate integration strategies aim to analyze multiple omic datasets at the same time without the need for data transformation prior to analysis[18][19]. The main approaches to doing so include similarity-based integration, joint dimension reduction, and statistical modeling.

Similarity-based integration identifies similarities or patterns across multi-omic datasets. Spectral clustering methods (e.g., Spectrum[20] and PC-MSC[21]) cluster cells based on similarity matrices derived from the multi-omic datasets, while graph fusion algorithms (e.g., Seurat4) construct graphs from the individual omics layers and merge them into a single graph[22].
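A simplified sketch of the similarity-based idea follows: one cell-by-cell affinity matrix is built per modality, the matrices are fused by plain averaging, and a spectral embedding is taken from the fused graph. This is not the algorithm used by Spectrum, PC-MSC, or Seurat4; the Gaussian kernel, the median bandwidth heuristic, the averaging step, and all names are assumptions chosen for brevity.

```python
import numpy as np

def gaussian_affinity(X):
    """Cell-by-cell similarity from squared Euclidean distances (Gaussian kernel)."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    sigma2 = np.median(d2[d2 > 0])  # heuristic bandwidth (assumption)
    return np.exp(-d2 / sigma2)

def fused_spectral_embedding(blocks, n_components=2):
    """Average per-modality affinities and embed cells with the top eigenvectors."""
    W = sum(gaussian_affinity(X) for X in blocks) / len(blocks)
    deg = W.sum(axis=1)
    # Symmetrically normalized affinity: D^{-1/2} W D^{-1/2}.
    S = W / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(S)  # eigenvalues in ascending order
    return W, vecs[:, -n_components:]

# Toy matched data: two groups of cells that differ in both modalities.
rng = np.random.default_rng(1)
groups = np.repeat([0, 1], 20)
rna_like = rng.normal(size=(40, 20)) + 5.0 * groups[:, None]
adt_like = rng.normal(size=(40, 30)) + 5.0 * groups[:, None]
W, emb = fused_spectral_embedding([rna_like, adt_like], n_components=2)
```

The rows of `emb` could then be passed to any standard clustering routine; published graph fusion methods typically replace the plain average with iterative or per-cell weighted schemes.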

In joint dimension reduction, the aim is to reduce the complexity of the multi-omics data by projecting them into a lower-dimensional latent space such that the different omics layers can be compared and analyzed together[23]. Canonical correlation analysis (CCA), non-negative matrix factorization (NMF), and manifold alignment are common methods for joint dimensionality reduction. Tools that use CCA or its extension, sparse CCA, such as Seurat3[24] and bindSC[25], identify linear relationships between datasets by finding linear combinations of variables that maximize their correlations with one another. Tools that use NMF (e.g., LIGER[26] and coupledNMF[27]) extract low-dimensional representations of high-dimensional data such that shared and dataset-specific factors across the multiple omics datasets can be identified. Manifold alignment (e.g., MATCHER[28] and MAGAN[29]) refers to an approach in which lower-dimensional representations of the multi-omic datasets are created individually and then aligned in a common latent space.
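To make the CCA step concrete, the sketch below implements textbook regularized CCA with NumPy for two matrices whose rows are the same cells. It is not the Seurat3 or bindSC implementation (those tools apply CCA variants in their own ways); the ridge term `reg`, the function names, and the simulated data are assumptions.

```python
import numpy as np

def _inv_sqrt(C):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(C)
    return (vecs / np.sqrt(vals)) @ vecs.T

def cca(X, Y, n_components=2, reg=1e-3):
    """Regularized CCA for matched rows; returns canonical variates and correlations."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Ridge term keeps the covariance matrices invertible when features are many.
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    Kx, Ky = _inv_sqrt(Cxx), _inv_sqrt(Cyy)
    # Singular values of the whitened cross-covariance are the canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky, full_matrices=False)
    A = Kx @ U[:, :n_components]
    B = Ky @ Vt[:n_components].T
    return Xc @ A, Yc @ B, s[:n_components]

# Toy data: two "modalities" driven by the same 2-dimensional latent state.
rng = np.random.default_rng(2)
z = rng.normal(size=(200, 2))
X = z @ rng.normal(size=(2, 10)) + 0.1 * rng.normal(size=(200, 10))
Y = z @ rng.normal(size=(2, 8)) + 0.1 * rng.normal(size=(200, 8))
x_scores, y_scores, corrs = cca(X, Y, n_components=2)
```

Each column pair of `x_scores` and `y_scores` is a pair of linear combinations with maximal correlation, which is the property the paragraph above describes.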

Statistical approaches can also be used to integrate information from multi-omic datasets. One well-known approach is the Bayesian framework, which facilitates probabilistic modeling of the multi-omic datasets. Tools that use a Bayesian clustering framework, such as BREM-SC[30], can jointly cluster multi-omic datasets and identify cell clusters. Another tool that takes a Bayesian approach to multi-omic integration is Clonealign, which, as the name suggests, integrates gene expression and copy number profiles to study cancer clones.

Late Integration

Late integration refers to the straightforward approach of processing and modeling each omics dataset separately, then combining the resulting models at the very end[18][19]. The advantage of this approach is that well-established tools already exist for each omics modality, and different clustering algorithms may be tailored to different omics data types. While late integration approaches have been commonly used in bulk multi-omics studies (e.g., Cluster-of-clusters analysis[31] and Kernel Learning Integrative Clustering[32]), late integration in the context of single-cell experiments is still a rapidly evolving field. One method of single-cell multi-omics late integration, known as ensemble clustering (e.g., SAME-clustering[33], Sc-GPE[34], EC-PGMGR[35]), has demonstrated promising potential in aggregating clustering results from diverse sources. It combines the clustering results from different omics datasets into a robust consensus clustering, modeling the relationships between the individual clustering results to find an improved global clustering solution across the different modalities.

However, while late integration is a workable solution for handling single-cell multi-omics datasets, it inherently lacks the capability to capture interactions and relationships between different omics modalities. The central purpose of multi-omics integration is to analyze the inter-omics relationships present in multi-omics data, enabling a better understanding of the underlying biological mechanisms driving disease pathogenesis. Hence, while late integration strategies have their merits, they are essentially single-omics analyses performed on multiple data types, which is not necessarily multi-omics integration.
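The consensus step of ensemble clustering can be sketched with a co-association matrix, which records how often each pair of cells is placed in the same cluster across the per-modality clusterings. The rule below (link cells that co-cluster in more than half of the clusterings, then take connected components) is a deliberately simple assumption and not the model used by SAME-clustering, Sc-GPE, or EC-PGMGR; the label vectors are made up.

```python
import numpy as np

def co_association(labelings):
    """Fraction of clusterings in which each pair of cells shares a cluster."""
    labelings = np.asarray(labelings)  # shape: (n_clusterings, n_cells)
    same = labelings[:, :, None] == labelings[:, None, :]
    return same.mean(axis=0)

def consensus_clusters(labelings, threshold=0.5):
    """Connected components of the graph linking cells with co-association > threshold."""
    C = co_association(labelings)
    n = C.shape[0]
    consensus = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if consensus[start] >= 0:
            continue
        consensus[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            # Unlabeled cells that co-cluster with cell i often enough.
            for j in np.flatnonzero((C[i] > threshold) & (consensus < 0)):
                consensus[j] = current
                stack.append(j)
        current += 1
    return C, consensus

# Hypothetical cluster labels for six cells from three separately analyzed modalities.
rna_labels = [0, 0, 0, 1, 1, 1]
protein_labels = [5, 5, 5, 7, 7, 8]
atac_labels = [0, 0, 1, 1, 1, 1]
C, consensus = consensus_clusters([rna_labels, protein_labels, atac_labels])
# consensus -> [0 0 0 1 1 1]
```

Cell 2 is grouped with cells 0 and 1 because two of the three clusterings agree, even though the third disagrees. Label values need not match across modalities, since only co-membership is used.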

Dimensionality Reduction

Main article: Dimensionality Reduction

Dimensionality reduction refers to the transformation of high-dimensional data into a lower-dimensional representation. This decrease in dimensionality reduces noise and simplifies the dataset, making the data easier to handle. Dimensionality reduction can be conducted using either feature selection or feature extraction. The former retains only the important variables from the original omic layers, while the latter transforms the original input features into combinations of those features. Dimensionality reduction is often a necessity for high-dimensional datasets and when a particular integration strategy requires it (e.g., early and intermediate integration).
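The difference between the two routes can be shown in a few lines of NumPy. Variance ranking stands in for feature selection and PCA for feature extraction; both are common choices but only examples, and the function names and toy matrix are assumptions.

```python
import numpy as np

def select_top_variable(X, n_features):
    """Feature selection: keep the original columns with the highest variance."""
    X = np.asarray(X, dtype=float)
    keep = np.sort(np.argsort(X.var(axis=0))[-n_features:])
    return X[:, keep], keep

def pca_extract(X, n_components):
    """Feature extraction: new features are linear combinations of all columns."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components]  # one row of feature weights per component
    return Xc @ loadings.T, loadings

# Toy matrix: 100 cells x 50 features, where the first five features vary most.
rng = np.random.default_rng(3)
X = rng.normal(size=(100, 50))
X[:, :5] *= 10.0
selected, keep = select_top_variable(X, n_features=5)
scores, loadings = pca_extract(X, n_components=3)
```

`selected` still contains measured features that can be named and interpreted directly, whereas each column of `scores` mixes all 50 inputs according to `loadings`.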

Example Tools Used in Single-Cell Multi-Omic Integration
Tool | Single-Cell Omic Types | GitHub
SCHEMA[36] | Multiple Omics (Matched) | https://github.com/rs239/schema
Spectrum[20] | Multiple Omics (Unmatched) | https://github.com/crj32/spectrum_manuscript
Seurat3[24] | Transcriptomics + Chromatin Accessibility | https://github.com/satijalab/seurat
Seurat4[37] | Transcriptomics + Chromatin Accessibility or Proteomics | https://github.com/satijalab/seurat/releases/tag/v4.4.0
Seurat5[38] | Transcriptomics + Chromatin Accessibility or Proteomics | https://github.com/satijalab/seurat
BindSC[25] | Transcriptomics + Chromatin Accessibility | https://github.com/KChen-lab/bindSC
BREM-SC[30] | Transcriptomics + Proteomics | https://github.com/tarot0410/BREMSC
CiteFuse[39] | Transcriptomics + Proteomics | https://github.com/SydneyBioX/CiteFuse
Clonealign[40] | Genomics + Transcriptomics | https://github.com/kieranrcampbell/clonealign
CoupledNMF[27] | Transcriptomics + Chromatin Accessibility | https://github.com/SUwonglab/CoupledNMF
LIGER[26] | Transcriptomics + Spatial Gene Expression or Genomics | https://github.com/welch-lab/liger
MAGAN[29] | Multiple Omics (Unmatched) | https://github.com/KrishnaswamyLab/MAGAN
MMD-MA[41] | Multiple Omics (Matched) | https://github.com/google-research/large_scale_mmdma
MOFA+[42] | Multiple Omics (Matched) | https://github.com/bioFAM/MOFA
scMVAE[43] | Multiple Omics (Matched) | https://github.com/cmzuo11/scMVAE
TotalVI[44] | Transcriptomics + Proteomics | https://github.com/YosefLab/totalVI_reproducibility
UnionCom[45] | Transcriptomics + Metabolomics | https://github.com/caokai1073/UnionCom

Considerations of Data Integration

Noise

As single-cell data is prone to noise from both biological and technical sources, robust de-noising methods may be necessary[46]. In single-cell experiments in particular, biological variation such as transcriptional bursts, cell cycle differences, and local cell environments can introduce noise into the datasets. Technical variation such as poor sequencing quality, uneven coverage, and sample contamination also needs to be addressed.

Dataset Compatibility

Integrating different omic modalities can be challenging due to differences in the inherent structure of the different datasets[47]. For example, scRNA-seq data is continuous while chromatin accessibility data (i.e., scATAC-seq) is binary. As such, integration of different modalities may require additional steps to transform the datasets into a common latent space. Even then, strategies such as early integration may still be prone to bias if one dataset contains more information than another.
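Before any joint embedding is attempted, each modality is usually put on a scale suited to its structure. The sketch below shows two widely used transformations, library-size log-normalization for expression counts and binarization for accessibility counts. The scale factor of 10,000 and the function names are assumptions, and this preprocessing alone does not produce a common latent space.

```python
import numpy as np

def log_normalize(counts, scale=1e4):
    """Library-size normalize each cell (row), then apply log(1 + x)."""
    counts = np.asarray(counts, dtype=float)
    lib = counts.sum(axis=1, keepdims=True)
    lib[lib == 0] = 1.0  # empty cells stay at zero instead of dividing by 0
    return np.log1p(counts / lib * scale)

def binarize(counts):
    """Treat any nonzero accessibility count as 'open' (1.0), otherwise 0.0."""
    return (np.asarray(counts) > 0).astype(float)

# Tiny made-up matrices: 2 cells x 2 features each.
rna_counts = np.array([[1, 3], [0, 0]])
atac_counts = np.array([[0, 2], [5, 0]])
rna_norm = log_normalize(rna_counts)
atac_bin = binarize(atac_counts)
```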

Dimensionality

Main article: High Dimensional Statistics

Analyzing large-scale single-cell multi-omics datasets can be computationally intensive because of the high dimensionality of the datasets. Therefore, the tools used to integrate the datasets need to be able to efficiently handle the high-dimensional datasets or computational methods need to be employed to first reduce the dimensionality of the datasets (see dimensionality reduction).

Interpretability and Validation

Many integration methods focus on statistical associations rather than detailed causal modeling. As such, interpreting and validating the results can be challenging, particularly if a neural network was used, as such models are inherently black boxes[19]. The utility of integration methods therefore needs to be assessed on practical applications, such as accurately identifying biologically relevant multi-omic relationships.

Matched and Unmatched Data

Integration of single-cell multi-omic data presents different challenges depending on whether the datasets are matched or unmatched[47]. Matched datasets contain multiple omic layers measured from the same individual cells, whereas unmatched datasets contain layers measured from different sets of cells. While matched datasets enable direct comparisons between the different omics layers within the same cell, they may not be as readily available as unmatched datasets. On the other hand, while unmatched datasets allow for the integration of different sources and conditions, they require consideration of potential biases and confounding factors.

Applications and Uses

While single-modality datasets have proven to be a mainstay in systems biology, combining biological information across multiple modalities has the potential to address biological questions that cannot be answered by a single data type alone. For example, the integration of transcriptome and DNA accessibility data has enabled the development of bioinformatic tools to infer cell-type-specific gene regulatory networks[48][49][50]. This is achieved by leveraging transcription factor and target gene expression along with cis-regulatory information to impute relevant transcription factors and their regulatory partners.

Another application of multi-omics integration is in expanding definitions of cell states by incorporating features observed across multiple modalities. For instance, integrating protein marker detection with transcriptome profiling using a multi-omics sequencing technology such as CITE-seq can resolve cell state signatures based on joint gene regulatory and surface marker expression[51]. This enables more robust inferences regarding cellular phenotypes, which are akin to and directly comparable with results from classical flow cytometry. Moreover, defining cell states based on clustering analysis within an integrated latent space may offer more stable estimations of cellular phenotypes compared to analysis within a single-modality latent space[1].

Furthermore, multi-omics integration can overcome modality-specific limitations. For example, most spatial transcriptomic sequencing technologies suffer from limited spatial resolution (pixels comprising a mixture of local cells) and low feature complexity[52]. Integration of spatial transcriptomics with scRNA-seq can help overcome these limitations by supporting the spatial deconvolution of low-resolution readouts and estimating the frequencies of each cell type[53][54].
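The deconvolution idea can be illustrated by treating a spatial pixel as a mixture of cell-type expression signatures taken from scRNA-seq. The sketch below solves for the mixture weights by ordinary least squares and then clips and rescales them. This is a crude stand-in, under stated assumptions, for the probabilistic and spatially informed models in the cited methods; the signatures, proportions, and function name are all simulated or hypothetical.

```python
import numpy as np

def deconvolve_spot(signatures, spot):
    """Estimate cell-type proportions for one spatial pixel.

    signatures: genes x cell_types mean expression (e.g., scRNA-seq cluster means)
    spot:       expression vector of the pixel, one value per gene
    """
    S = np.asarray(signatures, dtype=float)
    y = np.asarray(spot, dtype=float)
    # Unconstrained least squares: find w minimizing ||S w - y||.
    weights, *_ = np.linalg.lstsq(S, y, rcond=None)
    # Crude non-negativity: clip negatives, then rescale to sum to 1.
    weights = np.clip(weights, 0.0, None)
    total = weights.sum()
    return weights / total if total > 0 else weights

# Simulated signatures for 3 cell types over 200 genes, and a noiseless mixed pixel.
rng = np.random.default_rng(4)
signatures = rng.gamma(shape=2.0, scale=1.0, size=(200, 3))
true_props = np.array([0.6, 0.3, 0.1])
spot = signatures @ true_props
est_props = deconvolve_spot(signatures, spot)
```

With real data the pixel is noisy and signatures are imperfect, so methods built for this task generally add explicit count models, non-negativity constraints, or spatial priors rather than relying on clipping.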

References

  1. ^ a b c d e Miao, Zhen; Humphreys, Benjamin D; McMahon, Andrew P; Kim, Junhyong (2021). "Multi-omics integration in the age of million single-cell data". Nat Rev Nephrol. 17 (11): 710–724. doi:10.1038/s41581-021-00463-x. PMC 9191639. PMID 34417589.
  2. ^ a b Subramanian, Indhupriya (2020). "Multi-omics Data Integration, Interpretation, and Its Application". Bioinform Biol Insights. 14. doi:10.1177/1177932219899051. PMID 32076369.
  3. ^ a b Stuart, Tim; Satija, Rahul (2019). "Integrative single-cell analysis". Nat Rev Genet. 20 (5): 257–272. doi:10.1038/s41576-019-0093-7. PMID 30696980. S2CID 59409752.
  4. ^ Li, Yunjin; Ma, Lu; Wu, Duojiao; Chen, Geng (2021). "Advances in bulk and single-cell multi-omics approaches for systems biology and precision medicine". Brief Bioinform. 22 (5). doi:10.1093/bib/bbab024. PMID 33778867.
  5. ^ a b Adossa, Nigatu; Khan, Sofia; Rytkönen, Kalle T; Elo, Laura L (2021). "Computational strategies for single-cell multi-omics integration". Comput Struct Biotechnol J. 19: 2588–2596. doi:10.1016/j.csbj.2021.04.060.
  6. ^ Baysoy, Alev; Bai, Zhiliang; Satija, Rahul; Fan, Rong (2023). "The technological landscape and applications of single-cell multi-omics". Nat Rev Mol Cell Biol. 24 (10): 695–713. doi:10.1038/s41580-023-00615-w. PMC 10242609. PMID 37280296.
  7. ^ Macaulay, Iain C; Ponting, Chris P; Voet, Thierry (2017). "Single-Cell Multiomics: Multiple Measurements from Single Cells". Trends Genet. 33 (2): 155–168. doi:10.1016/j.tig.2016.12.003.
  8. ^ Regev, Aviv; Teichmann, Sarah A; Lander, Eric S; Amit, Ido; Benoist, Christophe; Birney, Ewan; Bodenmiller, Bernd; Campbell, Peter; Carninci, Piero; Clatworthy, Menna; Clevers, Hans; Deplancke, Bart; Dunham, Ian; Eberwine, James; Eils, Roland; Enard, Wolfgang; Farmer, Andrew; Fugger, Lars; Göttgens, Berthold; Hacohen, Nir; Haniffa, Muzlifah; Hemberg, Martin; Kim, Seung; Klenerman, Paul; Kriegstein, Arnold; Lein, Ed; Linnarsson, Sten; Lundberg, Emma; Lundeberg, Joakim; Majumder, Partha; Marioni, John C; Merad, Miriam; Mhlanga, Musa; Nawijn, Martijn; Netea, Mihai; Nolan, Garry; Pe'er, Dana; Philippakis, Anthony; Ponting, Chris P; Quake, Stephen; Reik, Wolf; Rozenblatt-Rosen, Orit; Sanes, Joshua; Satija, Rahul; Schumacher, Ton N; Shalek, Alex; Shapiro, Ehud; Sharma, Padmanee; Shin, Jay W; Stegle, Oliver; Stratton, Michael; Stubbington, Michael J T; Theis, Fabian J; Uhlen, Matthias; Van Oudenaarden, Alexander; Wagner, Allon; Watt, Fiona; Weissman, Jonathan; Wold, Barbara; Xavier, Ramnik; Yosef, Nir (2017). "The Human Cell Atlas". eLife. 6. doi:10.7554/eLife.27041. PMC 5762154. PMID 29206104.
  9. ^ Lindeboom, Rik G.H; Regev, Aviv; Teichmann, Sarah A (2021). "Towards a Human Cell Atlas: Taking Notes from the Past". Trends Genet. 37 (7): 625–630. doi:10.1016/j.tig.2021.03.007. PMID 33879355.
  10. ^ Weinstein, John N; Collisson, Eric A; Mills, Gordon B; Shaw, Kenna R Mills; Ozenberger, Brad A; Ellrott, Kyle; Shmulevich, Ilya; Sander, Chris; Stuart, Joshua M (2013). "The Cancer Genome Atlas Pan-Cancer analysis project". Nat Genet. 45 (10): 1113–1120. doi:10.1038/ng.2764. PMID 24071849.
  11. ^ The ENCODE Project Consortium (2012). "An integrated encyclopedia of DNA elements in the human genome". Nature. 489 (7414): 57–74. doi:10.1038/nature11247. PMC 3439153. PMID 22955616.
  12. ^ The ENCODE Project Consortium (2020). "Expanded encyclopaedias of DNA elements in the human and mouse genomes". Nature. 583 (7818): 699–710. doi:10.1038/s41586-020-2493-4. PMID 32728249.
  13. ^ Lähnemann, David; Köster, Johannes; Szczurek, Ewa; McCarthy, Davis J; Hicks, Stephanie C; Robinson, Mark D; Vallejos, Catalina A; Campbell, Kieran R; Beerenwinkel, Niko; Mahfouz, Ahmed; Pinello, Luca; Skums, Pavel; Stamatakis, Alexandros; Attolini, Camille Stephan-Otto; Aparicio, Samuel; Baaijens, Jasmijn; Balvert, Marleen; Barbanson, Buys De; Cappuccio, Antonio; Corleone, Giacomo; Dutilh, Bas E; Florescu, Maria; Guryev, Victor; Holmer, Rens; Jahn, Katharina; Lobo, Thamar Jessurun; Keizer, Emma M; Khatri, Indu; Kielbasa, Szymon M; Korbel, Jan O; Kozlov, Alexey M; Kuo, Tzu-Hao; Lelieveldt, Boudewijn P.F; Mandoiu, Ion I; Marioni, John C; Marschall, Tobias; Mölder, Felix; Niknejad, Amir; Rączkowska, Alicja; Reinders, Marcel; Ridder, Jeroen De; Saliba, Antoine-Emmanuel; Somarakis, Antonios; Stegle, Oliver; Theis, Fabian J; Yang, Huan; Zelikovsky, Alex; McHardy, Alice C; Raphael, Benjamin J; Shah, Sohrab P; Schönhuth, Alexander (2020). "Eleven grand challenges in single-cell data science". Genome Biol. 21 (1): 31. doi:10.1186/s13059-020-1926-6. PMC 7007675. PMID 32033589.
  14. ^ Santiago-Rodriguez, Tasha M; Hollister, Emily B (2021). "Multi 'omic data integration: A review of concepts, considerations, and approaches". Semin Perinatol. 45 (6). doi:10.1016/j.semperi.2021.151456. PMID 34256961. S2CID 235822759.
  15. ^ Yuan, Guo-Cheng; Cai, Long; Elowitz, Michael; Enver, Tariq; Fan, Guoping; Guo, Guoji; Irizarry, Rafael; Kharchenko, Peter; Kim, Junhyong; Orkin, Stuart; Quackenbush, John; Saadatpour, Assieh; Schroeder, Timm; Shivdasani, Ramesh; Tirosh, Itay (2017). "Challenges and emerging directions in single-cell analysis". Genome Biol. 18 (1): 84. doi:10.1186/s13059-017-1218-y. PMC 5421338. PMID 28482897.
  16. ^ Argelaguet, Ricard; Cuomo, Anna S. E; Stegle, Oliver; Marioni, John C (2021). "Computational principles and challenges in single-cell data integration". Nat Biotechnol. 39 (10): 1202–1215. doi:10.1038/s41587-021-00895-7. PMID 33941931. S2CID 233722751.
  17. ^ Wu, Yan; Zhang, Kun (2020). "Tools for the analysis of high-dimensional single-cell RNA sequencing data". Nat Rev Nephrol. 16 (7): 408–421. doi:10.1038/s41581-020-0262-0. S2CID 214672522.
  18. ^ a b c Adossa, Nigatu; Khan, Sofia; Rytkönen, Kalle T.; Elo, Laura L. (2021). "Computational strategies for single-cell multi-omics integration". Computational and Structural Biotechnology Journal. 19: 2588–2596. doi:10.1016/j.csbj.2021.04.060. PMC 8114078. PMID 34025945.
  19. ^ a b c d Picard, Milan; Scott-Boyer, Marie-Pier; Bodein, Antoine; Périn, Olivier; Droit, Arnaud (2021). "Integration strategies of multi-omics data for machine learning analysis". Computational and Structural Biotechnology Journal. 19: 3735–3746. doi:10.1016/j.csbj.2021.06.030. ISSN 2001-0370. PMC 8258788. PMID 34285775.
  20. ^ a b John, Christopher R; Watson, David; Barnes, Michael R; Pitzalis, Costantino; Lewis, Myles J (2019-09-10). "Spectrum: fast density-aware spectral clustering for single and multi-omic data". Bioinformatics. 36 (4): 1159–1166. doi:10.1093/bioinformatics/btz704. ISSN 1367-4803. PMC 7703791. PMID 31501851.
  21. ^ Kumar, Abhishek; Rai, Piyush; Daume, Hal (2011). "Co-regularized Multi-view Spectral Clustering". Advances in Neural Information Processing Systems. 24. Curran Associates, Inc.
  22. ^ Wang, Bo; Mezlini, Aziz M; Demir, Feyyaz; Fiume, Marc; Tu, Zhuowen; Brudno, Michael; Haibe-Kains, Benjamin; Goldenberg, Anna (2014-03). "Similarity network fusion for aggregating data types on a genomic scale". Nature Methods. 11 (3): 333–337. doi:10.1038/nmeth.2810. ISSN 1548-7091.
  23. ^ Cantini, Laura; Zakeri, Pooya; Hernandez, Celine; Naldi, Aurelien; Thieffry, Denis; Remy, Elisabeth; Baudot, Anaïs (2021-01-05). "Benchmarking joint multi-omics dimensionality reduction approaches for the study of cancer". Nature Communications. 12 (1): 124. doi:10.1038/s41467-020-20430-7. ISSN 2041-1723. PMC 7785750. PMID 33402734.
  24. ^ a b Stuart, Tim; Butler, Andrew; Hoffman, Paul; Hafemeister, Christoph; Papalexi, Efthymia; Mauck, William M.; Hao, Yuhan; Stoeckius, Marlon; Smibert, Peter; Satija, Rahul (2019-06). "Comprehensive Integration of Single-Cell Data". Cell. 177 (7): 1888–1902.e21. doi:10.1016/j.cell.2019.05.031. ISSN 0092-8674. PMC 6687398. PMID 31178118.
  25. ^ a b Dou, Jinzhuang; Liang, Shaoheng; Mohanty, Vakul; Miao, Qi; Huang, Yuefan; Liang, Qingnan; Cheng, Xuesen; Kim, Sangbae; Choi, Jongsu; Li, Yumei; Li, Li; Daher, May; Basar, Rafet; Rezvani, Katayoun; Chen, Rui (2022-05-09). "Bi-order multimodal integration of single-cell data". Genome Biology. 23 (1): 112. doi:10.1186/s13059-022-02679-x. ISSN 1474-760X. PMC 9082907. PMID 35534898.
  26. ^ a b Welch, Joshua D.; Kozareva, Velina; Ferreira, Ashley; Vanderburg, Charles; Martin, Carly; Macosko, Evan Z. (2019-06). "Single-Cell Multi-omic Integration Compares and Contrasts Features of Brain Cell Identity". Cell. 177 (7): 1873–1887.e17. doi:10.1016/j.cell.2019.05.006. ISSN 0092-8674. PMC 6716797. PMID 31178122.
  27. ^ a b Duren, Zhana; Chen, Xi; Zamanighomi, Mahdi; Zeng, Wanwen; Satpathy, Ansuman T.; Chang, Howard Y.; Wang, Yong; Wong, Wing Hung (2018-07-24). "Integrative analysis of single-cell genomics data by coupled nonnegative matrix factorizations". Proceedings of the National Academy of Sciences. 115 (30): 7723–7728. doi:10.1073/pnas.1805681115. ISSN 0027-8424. PMC 6065048. PMID 29987051.
  28. ^ Welch, Joshua D.; Hartemink, Alexander J.; Prins, Jan F. (2017-07-24). "MATCHER: manifold alignment reveals correspondence between single cell transcriptome and epigenome dynamics". Genome Biology. 18 (1): 138. doi:10.1186/s13059-017-1269-0. ISSN 1474-760X. PMC 5525279. PMID 28738873.
  29. ^ a b Amodio, Matthew; Krishnaswamy, Smita (2018-02-09), MAGAN: Aligning Biological Manifolds, doi:10.48550/arXiv.1803.00385, retrieved 2024-02-22
  30. ^ a b Wang, Xinjun; Sun, Zhe; Zhang, Yanfu; Xu, Zhongli; Xin, Hongyi; Huang, Heng; Duerr, Richard H; Chen, Kong; Ding, Ying; Chen, Wei (2020-05-07). "BREM-SC: a bayesian random effects mixture model for joint clustering single cell multi-omics data". Nucleic Acids Research. 48 (11): 5814–5824. doi:10.1093/nar/gkaa314. ISSN 0305-1048. PMC 7293045. PMID 32379315.
  31. ^ OSBREAC; Aure, Miriam Ragle; Vitelli, Valeria; Jernström, Sandra; Kumar, Surendra; Krohn, Marit; Due, Eldri U.; Haukaas, Tonje Husby; Leivonen, Suvi-Katri; Vollan, Hans Kristian Moen; Lüders, Torben; Rødland, Einar; Vaske, Charles J.; Zhao, Wei; Møller, Elen K. (2017-12). "Integrative clustering reveals a novel split in the luminal A subtype of breast cancer with impact on outcome". Breast Cancer Research. 19 (1). doi:10.1186/s13058-017-0812-y. ISSN 1465-542X. PMC 5372339. PMID 28356166.
  32. ^ Cabassi, Alessandra; Kirk, Paul D (27 June 2020). "Multiple kernel learning for integrative consensus clustering of omic datasets". Bioinformatics. doi:10.1093/bioinformatics/btaa593. PMC 7750932. PMID 32592464. Retrieved 2024-02-22.
  33. ^ academic.oup.com. doi:10.1093/nar/gkz959. PMC 6943136. PMID 31777938 https://academic.oup.com/nar/article/48/1/86/5644992. Retrieved 2024-02-22.
  34. ^ Zhu, Xiaoshu; Li, Jian; Li, Hong-Dong; Xie, Miao; Wang, Jianxin (2020-12-15). "Sc-GPE: A Graph Partitioning-Based Cluster Ensemble Method for Single-Cell". Frontiers in Genetics. 11. doi:10.3389/fgene.2020.604790. ISSN 1664-8021. PMC 7770236. PMID 33384718.
  35. ^ Zhu, Yuan; Zhang, De-Xin; Zhang, Xiao-Fei; Yi, Ming; Ou-Yang, Le; Wu, Mengyun (2020). "EC-PGMGR: Ensemble Clustering Based on Probability Graphical Model With Graph Regularization for Single-Cell RNA-seq Data". Frontiers in Genetics. 11. doi:10.3389/fgene.2020.572242. ISSN 1664-8021. PMC 7673820. PMID 33329710.
  36. ^ Singh, Rohit; Hie, Brian L.; Narayan, Ashwin; Berger, Bonnie (2021-05-03). "Schema: metric learning enables interpretable synthesis of heterogeneous single-cell modalities". Genome Biology. 22 (1): 131. doi:10.1186/s13059-021-02313-2. ISSN 1474-760X. PMC 8091541. PMID 33941239.
  37. ^ Hao, Yuhan; Hao, Stephanie; Andersen-Nissen, Erica; Mauck, William M.; Zheng, Shiwei; Butler, Andrew; Lee, Maddie J.; Wilk, Aaron J.; Darby, Charlotte; Zager, Michael; Hoffman, Paul; Stoeckius, Marlon; Papalexi, Efthymia; Mimitou, Eleni P.; Jain, Jaison (2021-06). "Integrated analysis of multimodal single-cell data". Cell. 184 (13): 3573–3587.e29. doi:10.1016/j.cell.2021.04.048. ISSN 0092-8674. PMC 8238499. PMID 34062119.
  38. ^ Hao, Yuhan; Stuart, Tim; Kowalski, Madeline H.; Choudhary, Saket; Hoffman, Paul; Hartman, Austin; Srivastava, Avi; Molla, Gesmira; Madad, Shaista; Fernandez-Granda, Carlos; Satija, Rahul (2024-02). "Dictionary learning for integrative, multimodal and scalable single-cell analysis". Nature Biotechnology. 42 (2): 293–304. doi:10.1038/s41587-023-01767-y. ISSN 1546-1696.
  39. ^ academic.oup.com. doi:10.1093/bioinformatics/btaa282 https://academic.oup.com/bioinformatics/article/36/14/4137/5827474. Retrieved 2024-02-22.
  40. ^ Campbell, Kieran R.; Steif, Adi; Laks, Emma; Zahn, Hans; Lai, Daniel; McPherson, Andrew; Farahani, Hossein; Kabeer, Farhia; O’Flanagan, Ciara; Biele, Justina; Brimhall, Jazmine; Wang, Beixi; Walters, Pascale; Consortium, IMAXT; Bouchard-Côté, Alexandre (2019-03-12). "clonealign: statistical integration of independent single-cell RNA and DNA sequencing data from human cancers". Genome Biology. 20 (1): 54. doi:10.1186/s13059-019-1645-z. ISSN 1474-760X. PMC 6417140. PMID 30866997.
  41. ^ Liu, Jie; Huang, Yuanhao; Singh, Ritambhara; Vert, Jean-Philippe; Noble, William Stafford (2019). "Jointly Embedding Multiple Single-Cell Omics Measurements". In Huber, Katharina T.; Gusfield, Dan (eds.). doi:10.4230/LIPICS.WABI.2019.10. ISSN 1868-8969.
  42. ^ Argelaguet, Ricard; Arnol, Damien; Bredikhin, Danila; Deloro, Yonatan; Velten, Britta; Marioni, John C.; Stegle, Oliver (2020-05-11). "MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data". Genome Biology. 21 (1): 111. doi:10.1186/s13059-020-02015-1. ISSN 1474-760X. PMC 7212577. PMID 32393329.
  43. ^ Zuo, Chunman; Chen, Luonan (2020-11-17). "Deep-joint-learning analysis model of single cell transcriptome and open chromatin accessibility data". Briefings in Bioinformatics. 22 (4). doi:10.1093/bib/bbaa287. ISSN 1467-5463. PMC 8293818. PMID 33200787.
  44. ^ Gayoso, Adam; Steier, Zoë; Lopez, Romain; Regier, Jeffrey; Nazor, Kristopher L.; Streets, Aaron; Yosef, Nir (2021-03). "Joint probabilistic modeling of single-cell multi-omic data with totalVI". Nature Methods. 18 (3): 272–282. doi:10.1038/s41592-020-01050-x. ISSN 1548-7105. PMC 7954949. PMID 33589839.
  45. ^ Cao, Kai; Bai, Xiangqi; Hong, Yiguang; Wan, Lin (2020-07-01). "Unsupervised topological alignment for single-cell multi-omics integration". Bioinformatics. 36 (Supplement_1): i48–i56. doi:10.1093/bioinformatics/btaa443. ISSN 1367-4803. PMC 7355262. PMID 32657382.
  46. ^ Janssen, Philipp; Kliesmete, Zane; Vieth, Beate; Adiconis, Xian; Simmons, Sean; Marshall, Jamie; McCabe, Cristin; Heyn, Holger; Levin, Joshua Z.; Enard, Wolfgang; Hellmann, Ines (2023-06-19). "The effect of background noise and its removal on the analysis of single-cell expression data". Genome Biology. 24 (1): 140. doi:10.1186/s13059-023-02978-x. ISSN 1474-760X. PMC 10278251. PMID 37337297.
  47. ^ a b Argelaguet, Ricard; Cuomo, Anna S. E.; Stegle, Oliver; Marioni, John C. (2021-10). "Computational principles and challenges in single-cell data integration". Nature Biotechnology. 39 (10): 1202–1215. doi:10.1038/s41587-021-00895-7. ISSN 1546-1696.
  48. ^ Kim, Daniel; Tran, Andy; Kim, Hani Jieun; Lin, Yingxin; Yang, Jean Yee Hwa; Yang, Pengyi (2023). "Gene regulatory network reconstruction: harnessing the power of single-cell multi-omic data". npj Syst Biol Appl. 9 (1): 51. doi:10.1038/s41540-023-00312-6. PMC 10587078. PMID 37857632.
  49. ^ Bravo González-Blas, Carmen; De Winter, Seppe; Hulselmans, Gert; Hecker, Nikolai; Matetovici, Irina; Christiaens, Valerie; Poovathingal, Suresh; Wouters, Jasper; Aibar, Sara; Aerts, Stein (2023). "SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks". Nat Methods. 20 (9): 1355–1367. doi:10.1038/s41592-023-01938-4. PMC 10482700. PMID 37443338.
  50. ^ Fleck, Jonas Simon; Jansen, Sophie Martina Johanna; Wollny, Damian; Zenk, Fides; Seimiya, Makiko; Jain, Akanksha; Okamoto, Ryoko; Santel, Malgorzata; He, Zhisong; Camp, J. Gray; Treutlein, Barbara (2023). "Inferring and perturbing cell fate regulomes in human brain organoids". Nature. 621 (7978): 365–372. doi:10.1038/s41586-022-05279-8. PMID 36198796.
  51. ^ Stoeckius, Marlon; Hafemeister, Christoph; Stephenson, William; Houck-Loomis, Brian; Chattopadhyay, Pratip K; Swerdlow, Harold; Satija, Rahul; Smibert, Peter (2017). "Simultaneous epitope and transcriptome measurement in single cells". Nat Methods. 14 (9): 865–868. doi:10.1038/nmeth.4380. PMC 5669064. PMID 28759029.
  52. ^ Atta, Lyla; Fan, Jean (2021). "Computational challenges and opportunities in spatially resolved transcriptomic data analysis". Nat Commun. 12 (1): 5283. doi:10.1038/s41467-021-25557-9. PMC 8421472. PMID 34489425.
  53. ^ Andersson, Alma; Bergenstråhle, Joseph; Asp, Michaela; Jurek, Aleksandra; Fernández Navarro, José; Lundeberg, Joakim (2020). "Single-cell and spatial transcriptomics enables probabilistic inference of cell type topography". Commun Biol. 3 (1): 565. doi:10.1038/s42003-020-01247-y. PMID 33037292.
  54. ^ Ma, Ying; Zhou, Xiang (2022). "Spatially informed cell-type deconvolution for spatial transcriptomics". Nat Biotechnol. 40 (9): 1349–1359. doi:10.1038/s41587-022-01273-7. PMID 35501392.