Population-level analysis of gut microbiome variation
Gwen Falony,1,2 Marie Joossens,1,2,3 Sara Vieira-Silva,1,2 Jun Wang,1,2 Youssef Darzi,1,2,3
Karoline Faust,1,2,3 Alexander Kurilshikov,4,5 Marc Jan Bonder,6 Mireia Valles-Colomer,1,2
Doris Vandeputte,1,2,3 Raul Y. Tito,1,2,3 Samuel Chaffron,1,2,3 Leen Rymenans,1,2,3
Chloë Verspecht,1,2 Lise De Sutter,1,2,3 Gipsi Lima-Mendez,1,2 Kevin D’hoe,1,2,3 Karl Jonckheere,2,3
Daniel Homola,2,3† Roberto Garcia,2,3 Ettje F. Tigchelaar,6,7 Linda Eeckhaudt,2,3 Jingyuan Fu,6,8
Liesbet Henckaerts,1,9 Alexandra Zhernakova,6,7 Cisca Wijmenga,6 Jeroen Raes,1,2,3‡
Fecal microbiome variation in the average, healthy population has remained under-investigated. Here, we analyzed two independent, extensively phenotyped cohorts: the
Belgian Flemish Gut Flora Project (FGFP; discovery cohort; N = 1106) and the Dutch
LifeLines-DEEP study (LLDeep; replication; N = 1135). Integration with global data sets
(N combined = 3948) revealed a 14-genera core microbiota, but the 664 identified
genera still underexplore total gut diversity. Sixty-nine clinical and questionnaire-based
covariates were found to be associated with microbiota compositional variation, with a 92%
replication rate. Stool consistency showed the largest effect size, whereas medication
explained the largest total variance and interacted with other covariate-microbiota
associations. Early-life events such as birth mode were not reflected in adult microbiota
composition. Finally, we found that proposed disease marker genera were associated with
host covariates, urging inclusion of the latter in study design.
Sequencing-based assessment of microbial communities in human fecal material has linked alterations in gut microbiota composition to disease, as well as chronically suboptimal health and well-being (1–3). The discovery of these associations has stimulated the search for specific microbiome-based biomarkers for a wide range of pathologies (4–9). However, major challenges still hamper the once assumed imminent translation of microbiome monitoring into diagnostic and clinical practice. One such hurdle is the lack of knowledge about the impact of host and environmental factors on microbiota variation within an average, healthy population. Such information is essential for robust disease marker identification in clinical metagenomics (10). To identify and characterize major microbiome-associated variables, the Flemish Gut Flora Project (FGFP) initiated a large-scale cross-sectional fecal sampling effort in a confined geographic region (Flanders, Belgium). FGFP collection protocols combined rigorous sampling logistics, including frozen sample collection and cold chain monitoring, with exhaustive phenotyping through online questionnaires, standardized anamnesis and health assessment by general medical practitioners (GPs), and extended clinical blood profiling (fig. S1). Encompassing an equilibrated range of age, gender, health, and lifestyle, the FGFP cohort is expected to be representative of the average gut microbiota composition in a Western European population (table S1). From this cohort, fecal samples of 1106 individuals (98.5% of Western or Eastern European ethnicity; 96.8% born in Belgium) with time-matched blood and questionnaire data were analyzed. Microbiome phylogenetic profiling was performed using 16S ribosomal RNA (rRNA) gene amplicon sequencing. In addition, a Dutch cohort (N = 1135, LifeLines-DEEP, LLDeep;
1 KU Leuven–University of Leuven, Department of Microbiology and Immunology, Leuven, Belgium.
2 VIB, Center for the Biology of Disease, Leuven, Belgium.
3 Vrije Universiteit Brussel, Faculty of Sciences and Bioengineering Sciences, Microbiology Unit, Brussels, Belgium.
4 Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk, Russia.
5 Novosibirsk State University, Novosibirsk, Russia.
6 University of Groningen, University Medical Center Groningen, Department of Genetics, 9700 RB Groningen, Netherlands.
7 Top Institute Food and Nutrition, Wageningen, Netherlands.
8 University of Groningen, University Medical Center Groningen, Department of Pediatrics, 9700 RB Groningen, Netherlands.
9 KU Leuven–University Hospitals Leuven, Department of General Internal Medicine, Leuven, Belgium.
*These authors contributed equally to this work. †Present address:
Computation and Systems Medicine Section, Department of
Surgery and Cancer, Imperial College London, London, UK.
‡Corresponding author. Email: firstname.lastname@example.org
Fig. 1. Microbial community variation in the FGFP cohort, represented by principal coordinates analysis (PCoA, genus-level Bray-Curtis dissimilarity).
(A) Top 10 contributors to community variation as determined by canonical correspondence analysis on unscaled genera abundances, plotted on the first two
PCoA dimensions (arrows scaled to contribution). (B) FGFP sample density on the PCoA plot; arrows indicate density peaks enriched in the three previously
proposed enterotype drivers: Prevotella, Bacteroides, and Ruminococcaceae genera.
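
For readers unfamiliar with the ordination named in the Fig. 1 caption, the following is a minimal sketch of a principal coordinates analysis on genus-level Bray-Curtis dissimilarities. It is not the FGFP analysis pipeline: the function names `bray_curtis` and `pcoa` and the four-sample, three-genus matrix `toy` are hypothetical and chosen only for illustration, and the sketch assumes that every sample contains at least one non-zero count.

```python
import numpy as np


def bray_curtis(counts):
    """Pairwise Bray-Curtis dissimilarity between samples (rows).

    Assumes non-negative abundances and at least one non-zero count
    per sample (otherwise the denominator would be zero).
    """
    counts = np.asarray(counts, dtype=float)
    # Sum over genera of |x_i - x_j| for every pair of samples
    diff = np.abs(counts[:, None, :] - counts[None, :, :]).sum(axis=2)
    # Total abundance of both samples in each pair
    total = (counts[:, None, :] + counts[None, :, :]).sum(axis=2)
    return diff / total


def pcoa(dist, n_axes=2):
    """Classical PCoA: eigendecomposition of the double-centred
    matrix of squared dissimilarities."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Bray-Curtis is not Euclidean, so trailing eigenvalues can be
    # negative; clip the retained ones at zero before the square root.
    lead = np.clip(eigvals[:n_axes], 0.0, None)
    coords = eigvecs[:, :n_axes] * np.sqrt(lead)
    # Share of the positive eigenvalue sum carried by each axis
    explained = lead / eigvals[eigvals > 0].sum()
    return coords, explained


# Hypothetical toy data: 4 samples (rows) x 3 genera (columns)
toy = np.array([
    [10, 0, 0],
    [0, 10, 0],
    [5, 5, 0],
    [8, 1, 1],
])
dist = bray_curtis(toy)
coords, explained = pcoa(dist, n_axes=2)
```

Each row of `coords` gives one sample's position on the first two principal coordinates, which is the kind of layout on which the arrows and density peaks in a plot such as Fig. 1 are drawn. Established implementations (for example in scikit-bio or the R package vegan) would be preferable for real data.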