Motivation: Single-cell RNA sequencing has been proved to be revolutionary for its potential of zooming into complex biological systems. Genome-wide expression analysis at single-cell resolution provides a window into dynamics of cellular phenotypes. This facilitates the characterization of transcriptional heterogeneity in normal and diseased tissues under various conditions. It also sheds light on the development or emergence of specific cell populations and phenotypes. However, owing to the paucity of input RNA, a typical single cell RNA sequencing data features a high number of dropout events where transcripts fail to get amplified.
Results: We introduce mcImpute, a low-rank matrix completion based technique to impute dropouts in single cell expression data. On a number of real datasets, application of mcImpute yields significant improvements in the separation of true zeros from dropouts, cell-clustering, differential expression analysis, cell type separability, the performance of dimensionality reduction techniques for cell visualization, and gene distribution.