Helmholtz Gemeinschaft

Identifying cancer cells from calling single-nucleotide variants in scRNA-seq data

PDF (Accepted Manuscript), 6MB
PDF (Supplementary Data), 14MB

Item Type:Article
Title:Identifying cancer cells from calling single-nucleotide variants in scRNA-seq data
Creators Name:Marot-Lassauzaie, V., Beneyto-Calabuig, S., Obermayer, B., Velten, L., Beule, D. and Haghverdi, L.
Abstract:MOTIVATION: Single-cell RNA sequencing (scRNA-seq) data is widely used to study cancer cell states and their heterogeneity. However, the tumour microenvironment is usually a mixture of healthy and cancerous cells, and it can be difficult to fully separate these two populations based on transcriptomics alone. If available, somatic single-nucleotide variants (SNVs) observed in the scRNA-seq data could be used to identify the cancer population and match that information with the single cells' expression profiles. However, calling somatic SNVs in scRNA-seq data is a challenging task, as most variants seen in the short-read data are not somatic but are instead germline variants, RNA edits, or transcription, sequencing or processing errors. Additionally, only variants present in actively transcribed regions of each individual cell will be seen in the data. RESULTS: To address these challenges, we develop CCLONE (Cancer Cell Labelling On Noisy Expression), an interpretable tool adapted to handle the uncertainty and sparsity of SNVs called from scRNA-seq data. CCLONE jointly identifies cancer clonal populations and their associated variants. We apply CCLONE to two acute myeloid leukaemia datasets and one lung adenocarcinoma dataset and show that CCLONE captures both genetic clones and somatic events for multiple patients. These results show how CCLONE can be used to gather insight into the course of the disease and the origin of cancer cells in scRNA-seq data. AVAILABILITY AND IMPLEMENTATION: Source code is available at github.com/HaghverdiLab/CCLONE. CONTACT: Laleh.Haghverdi@mdc-berlin.de
Keywords:Algorithms, Lung Neoplasms, Neoplasms, Single Nucleotide Polymorphism, RNA-Seq, RNA Sequence Analysis, Single-Cell Analysis, Single-Cell Gene Expression Analysis, Software
Source:Bioinformatics
ISSN:1367-4803
Publisher:Oxford University Press
Volume:40
Number:9
Page Range:btae512
Date:September 2024
Official Publication:https://doi.org/10.1093/bioinformatics/btae512
