Discovery of rare cells from voluminous single cell expression data

Aashi Jindal, Prashant Gupta, Jayadeva, and Debarka Sengupta, Nature Communications (2018).


Single cell messenger RNA sequencing (scRNA-seq) provides a window into transcriptional landscapes in complex tissues. The recent introduction of droplet based transcriptomics platforms has enabled the parallel screening of thousands of cells. Large-scale single cell transcriptomics is advantageous as it promises the discovery of a number of rare cell sub-populations. Existing algorithms to find rare cells scale unbearably slowly or terminate, as the sample size grows to the order of tens of thousands. We propose Finder of Rare Entities (FiRE), an algorithm that, in a matter of seconds, assigns a rareness score to every individual expression profile under study. We demonstrate how FiRE scores can help bioinformaticians focus the downstream analyses only on a fraction of expression profiles within ultra-large scRNA-seq data. When applied to a large scRNA-seq dataset of mouse brain cells, FiRE recovered a novel sub-type of the pars tuberalis lineage.