electrophysiological characteristics, often combined
with molecular markers (1–5). Systematic in situ
hybridization has revealed extensive regional
heterogeneity (6). However, none of these properties carries enough information to identify cell type definitively in every
case (7). Single-cell RNA sequencing (RNA-seq) has been used to
classify cells in spleen (8), lung epithelium (9), and
embryonic brain (10). However, the adult nervous
system has greater complexity and more cell types,
presenting a challenge both to sample preparation methods and to computational analysis.
Here, we have used quantitative single-cell
RNA-seq (11) to perform a molecular census of
the primary somatosensory cortex (S1) and the
hippocampal CA1 region, based on 3005 single-cell transcriptomes (Fig. 1A and fig. S1, A to C).
Individual RNA molecules were counted using
unique molecular identifiers (UMIs) (essentially
tags that identify individual molecules) (12) (figs.
S1, D to J, and S2, A to E) and confirmed by
single-molecule RNA fluorescence in situ hybridization (FISH) (fig. S2, G to I).
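
The idea behind UMI-based molecule counting can be illustrated with a minimal sketch. It assumes each aligned read has already been reduced to a (cell, gene, UMI) triple; the function name, the input layout, and the toy reads are hypothetical, and the sketch ignores UMI sequencing errors and collisions, so it is not the pipeline used in this study.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell, gene) pair.

    reads: iterable of (cell_id, gene, umi) tuples, one per aligned read
    (a hypothetical, simplified input format).
    Reads sharing cell, gene, and UMI are treated as amplification
    copies of one original molecule and are counted once.
    """
    umis_seen = defaultdict(set)
    for cell_id, gene, umi in reads:
        umis_seen[(cell_id, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Toy reads (invented for illustration only).
reads = [
    ("cell1", "Gad1", "ACGTA"),
    ("cell1", "Gad1", "ACGTA"),  # duplicate read of the same molecule
    ("cell1", "Gad1", "TTGCA"),
    ("cell1", "Mbp", "ACGTA"),   # same UMI, different gene: separate molecule
    ("cell2", "Mbp", "GGCAT"),
]
counts = count_molecules(reads)
print(counts[("cell1", "Gad1")])  # 2 molecules from 3 reads
```

Counting distinct tags rather than reads is what makes the measurement insensitive to uneven amplification.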
We used clustering to discover molecularly
distinct classes of cells. Standard hierarchical
clustering resulted in fragmented clusters (fig.
S4), because most genes were not informative in
most pairwise comparisons and contributed at
best only noise. Biclustering can overcome this
problem by simultaneously clustering genes and
cells. We developed BackSPIN (see the supplementary
materials), a divisive biclustering method
based on sorting points into neighborhoods (SPIN)
(13), which revealed nine major classes of cells: S1
and CA1 pyramidal neurons, interneurons, oligodendrocytes, astrocytes, microglia, vascular endothelial cells, mural cells (that is, pericytes and
vascular smooth muscle cells), and ependymal
cells (Fig. 1, A and B, and fig. S3).
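
A single divisive biclustering step can be sketched as follows. This is a simplified illustration, not BackSPIN itself: the SPIN sorting step is replaced by a seed-based split (the two least-correlated cells act as seeds), and assigning each gene to the cell group with the higher mean expression is an assumption made for the sketch. All names and the toy matrix are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def split_once(expr):
    """Split cells AND genes into two biclusters.

    expr: {gene: [expression value per cell]}.
    Returns ((cells_a, genes_a), (cells_b, genes_b)).
    """
    genes = list(expr)
    n_cells = len(expr[genes[0]])
    profiles = [[expr[g][c] for g in genes] for c in range(n_cells)]

    # Stand-in for SPIN sorting: use the least-correlated pair of cells as seeds.
    seed_a, seed_b = min(
        combinations(range(n_cells), 2),
        key=lambda p: pearson(profiles[p[0]], profiles[p[1]]),
    )

    # Each cell joins the seed it correlates with more strongly.
    cells_a, cells_b = [], []
    for c in range(n_cells):
        r_a = pearson(profiles[c], profiles[seed_a])
        r_b = pearson(profiles[c], profiles[seed_b])
        (cells_a if c != seed_b and r_a >= r_b else cells_b).append(c)

    # Each gene joins the cell group in which its mean expression is higher.
    genes_a, genes_b = [], []
    for g in genes:
        mean_a = sum(expr[g][c] for c in cells_a) / len(cells_a)
        mean_b = sum(expr[g][c] for c in cells_b) / len(cells_b)
        (genes_a if mean_a >= mean_b else genes_b).append(g)

    return (cells_a, genes_a), (cells_b, genes_b)


# Toy matrix (invented): two genes mark cells 0-2, two mark cells 3-5.
toy = {
    "geneA": [9, 8, 10, 0, 1, 0],
    "geneB": [7, 9, 8, 1, 0, 1],
    "geneC": [0, 1, 0, 9, 10, 8],
    "geneD": [1, 0, 1, 8, 7, 9],
}
bicluster_a, bicluster_b = split_once(toy)
print(bicluster_a, bicluster_b)
```

Because each half carries its own gene set, a further split of that half would be computed only on genes assigned to it, which is how simultaneous clustering of genes and cells limits the noise from uninformative genes.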
The data set allowed us to identify the most
specific markers for each class, many of which
are known to play a functional role in these cells
(fig. S5). S1 pyramidal cells were marked by Tbr1,
a transcription factor required for the final differentiation of cortical projection neurons; oligodendrocytes by Hapln2, encoding a protein required
for proper formation of nodes of Ranvier; mural
cells by Acta2, a key component of actin thin filaments; and endothelial cells by Ly6c1 [expressed
by monocytes peripherally, and endothelial cells
in the brain (14)]. Some were novel, such as
Gm11549 (a long noncoding RNA specific to S1
pyramidal neurons), Spink8 (a serine protease inhibitor specific to hippocampal pyramidal cells),
and Pnoc (prepronociceptin, here identified as an
By repeating biclustering on each of the nine
major classes (Fig. 1C and figs. S5 to S8), we identified a total of 47 molecularly distinct subclasses
of cells. Every subclass was detected in multiple
mice (fig. S1K), arguing that cell identity was
preserved across these genetically outbred (CD-1)
mice. Neurons contained more RNA than glia
and vascular cells and a larger number of detectable genes (Fig. 1C and fig. S1E). Mitochondrial
mRNAs were less variable, although mitochondrial tRNAs were highly specifically enriched in
endothelial cells (fig. S1E).
We identified seven subclasses of S1 pyramidal
cells (Fig. 2A and figs. S6A and S7), which were
largely layer-specific. The superficial layers II/III
and IV were represented by single populations,
whereas layer V showed two distinct subclasses.
Layers VI and VIb were represented by single
populations, but in addition we found a subclass
lacking specific markers but expressing common
deep-layer markers such as Pcp4. A distinct subclass expressed Synpr and Nr4a2, which are abundant in the adjacent claustrum, with some cells
extending into S1.
We found two types of CA1 glutamatergic cells
(fig. S8), plus cells derived from the adjacent CA2
(as defined by Pcp4) and subiculum (as defined
by Ly6g6e). Genes highly expressed in type 2 CA1
pyramidal neurons were associated with mitochondrial function (fig. S8), which has been shown
to correlate with the firing rate and length of projections in cortical neurons (15). Orthogonal to the
two main classes, we found CA1 layer–specific
markers (i.e., Calb1 and Nov), as well as dorsoventrally patterned genes (i.e., Wfs1 and Grp)
(16), in both of the two main types of CA1 cells.
These may correspond to functional differences
between layers (17).
We found 16 subclasses of interneurons (Fig. 2B
and fig. S6, C and D), but there are likely more
subclasses because we achieved only shallow sampling of Sst- and Pvalb-expressing cells. In superficial
layers of S1, we identified an Htr3a- and Pax6-
expressing interneuron subclass, confirmed by
immunohistochemistry (Fig. 2C) [13.9 ± 2.4% of
serotonin (5HT) receptor 3a-enhanced green fluorescent protein (5HT3aEGFP) cells in layer I, n = 4
mice, 636 cells analyzed]. These interneurons specifically expressed Myh8, Fut9, and Manea. In
whole-cell current clamp recordings of layer I
neurons, subsequently stained for PAX6, these
cells exhibited intrinsic electrophysiological and
1Division of Molecular Neurobiology, Department of Medical
Biochemistry and Biophysics, Karolinska Institutet, S-171 77
Stockholm, Sweden. 2Department of Immunology, Genetics
and Pathology, Rudbeck Laboratory, Uppsala University, Dag
Hammarskjölds väg 20, S-751 85 Uppsala, Sweden. 3Division
of Vascular Biology, Department of Medical Biochemistry
and Biophysics, Karolinska Institutet, S-171 77 Stockholm,
Sweden. 4Department of Oncology-Pathology, Karolinska
Institutet, S-171 76 Stockholm, Sweden.
*These authors contributed equally to this work. †Corresponding
author. E-mail: firstname.lastname@example.org (S.L.); jens.hjerling-leffler@
[Figure 1 graphic. Recoverable labels: whole tissue from cortex (S1) and CA1, 9 classes; marker genes Gad1, Thy1, Mbp, Aldoc, Aif1, Cldn5; class labels CA1 Pyramidal, Oligodendrocytes, Microglia; bar plots of Cells, RNA, and Genes.]
Fig. 1. Molecular census of somatosensory S1 cortex and hippocampus
CA1 by unbiased sampling and single-cell RNA-seq. (A) Workflow for
obtaining and analyzing single-cell RNA-seq from juvenile mouse cortical cells,
from dissection to single-cell RNA-seq and biclustering. (B) Visualization of
nine major classes of cells using t-distributed stochastic neighbor embedding
(tSNE). Each dot is a single cell, and cells are laid out to show similarities.
Colored contours correspond to the nine clusters in (A) and fig. S3. Expression of known markers is shown using the same layout (blue, no expression; white,
1% quantile; red, 99% quantile). (C) Hierarchical clustering analysis on 47 subclasses. Bar plots show number of captured cells in CA1 and S1, number of
detected polyA+ RNA molecules per cell, and total number of genes detected per cell.