Pancreatic progenitor epigenome maps prioritize type 2 diabetes risk genes with roles in development

Ryan J. Geusz, Allen Wang, Joshua Chiou, Joseph J. Lancman, Nichole Wetton, Samy Kefalopoulou, Jinzhao Wang, Yunjiang Qiu, Jian Yan, Anthony Aylward, Bing Ren, P. Duc Si Dong, Kyle J. Gaulton (kgaulton@ucsd.edu), Maike Sander (masander@ucsd.edu)

*These authors contributed equally

Affiliations: Biomedical Graduate Studies Program, University of California San Diego, La Jolla, CA, USA; Department of Cellular & Molecular Medicine, University of California San Diego, La Jolla, CA 92093, USA; Department of Pediatrics, Pediatric Diabetes Research Center, University of California San Diego, La Jolla, CA 92093, USA; Human Genetics Program, Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA; Ludwig Institute for Cancer Research, La Jolla, CA 92093-0653, USA; Sanford Consortium for Regenerative Medicine, San Diego, La Jolla, CA 92093, USA

May 2020

Keywords: LAMA1, laminin, CRB2, CRISPR, pancreas development, beta cell, hESC

Genetic variants associated with type 2 diabetes (T2D) risk affect gene regulation in metabolically relevant tissues, such as pancreatic islets. Here, we investigated contributions of regulatory programs active during pancreatic development to T2D risk. Generation of chromatin maps from developmental precursors throughout pancreatic differentiation of human embryonic stem cells (hESCs) identifies enrichment of T2D variants in pancreatic progenitor-specific stretch enhancers that are not active in islets. Genes associated with progenitor-specific stretch enhancers are predicted to regulate developmental processes, most notably tissue morphogenesis. Through gene editing in hESCs, we demonstrate that progenitor-specific enhancers harboring T2D-associated variants regulate cell polarity genes LAMA1 and CRB2. Knockdown of lama1 or crb2 in zebrafish embryos causes a defect in pancreas morphogenesis and impairs islet cell development. Together, our findings reveal that a subset of T2D risk variants specifically affects pancreatic developmental programs, suggesting that dysregulation of developmental processes can predispose to T2D.

INTRODUCTION

Type 2 diabetes (T2D) is a multifactorial metabolic disorder characterized by insulin insensitivity and insufficient insulin secretion by pancreatic beta cells (Halban et al., 2014). Genetic association studies have identified hundreds of loci influencing risk of T2D (Mahajan et al., 2018). However, disease-relevant target genes of T2D risk variants, the mechanisms by which these genes cause disease, and the tissues in which the genes mediate their effects remain poorly understood.

The majority of T2D risk variants map to non-coding sequence, suggesting that genetic risk of T2D is largely mediated through variants affecting transcriptional regulatory activity. Intersection of T2D risk variants with epigenomic data has uncovered enrichment of T2D risk variants in regulatory sites active in specific cell types, predominantly in pancreatic beta cells, including risk variants that affect regulatory activity directly (Chiou et al., 2019; Fuchsberger et al., 2016; Gaulton et al., 2015; Gaulton et al., 2010; Greenwald et al., 2019; Mahajan et al., 2018; Parker et al., 2013; Pasquali et al., 2014; Thurner et al., 2018; Varshney et al., 2017). T2D risk-associated variants are further enriched within large, contiguous regions of islet active chromatin, referred to as stretch or super-enhancers (Parker et al., 2013). These regions of active chromatin preferentially bind islet cell-restricted transcription factors and drive islet-specific gene expression (Parker et al., 2013; Pasquali et al., 2014).

Many genes associated with T2D risk in islets are not uniquely expressed in differentiated islet endocrine cells, but also in pancreatic progenitor cells during embryonic development. For example, T2D risk variants map to HNF1A, HNF1B, HNF4A, MNX1, NEUROG3, PAX4, and PDX1 (Flannick et al., 2019; Mahajan et al., 2018; Steinthorsdottir et al., 2014), which are all transcription factors also expressed in pancreatic developmental precursors. Studies in model organisms and hESC-based models of pancreatic endocrine cell differentiation have shown that inactivation of these transcription factors causes defects in endocrine cell development, resulting in reduced beta cell numbers (Gaertner et al., 2019). Furthermore, heterozygous mutations for HNF1A, HNF1B, HNF4A, PAX4, and PDX1 are associated with maturity onset diabetes of the young (MODY), which is an autosomal dominant form of diabetes with features similar to T2D (Urakami, 2019). Thus, there is evidence that reduced activity of developmentally expressed transcription factors can cause diabetes later in life. The role of these transcription factors in T2D and MODY could be explained by their functions in regulating gene expression in mature islet cells. However, it is also possible that their function during endocrine cell development could predispose to diabetes instead of, or in addition to, endocrine cell gene regulation. One conceivable mechanism is that individuals with reduced activity of these transcription factors are born with either fewer beta cells or beta cells more prone to fail under conditions of increased insulin demand. Observations showing that disturbed intrauterine metabolic conditions, such as maternal malnutrition, can lead to reduced beta cell mass and T2D predisposition in the offspring (Lumey et al., 2015; Nielsen et al., 2014; Portha et al., 2011) support the concept that compromised beta cell development could predispose to T2D. 
However, whether there is T2D genetic risk relevant to the regulation of endocrine cell development independent of gene regulation in mature islet cells has not been explored.

In this study, we investigated the contribution of gene regulatory programs specifically active during pancreatic development to T2D risk. First, we employed a hESC-based differentiation system to generate chromatin maps of hESCs during their stepwise differentiation into pancreatic progenitor cells. We then identified T2D-associated variants localized in active enhancers in developmental precursors but not in mature islets, used genome editing in hESCs to define target genes of pancreatic progenitor-specific enhancers harboring T2D variants, and employed zebrafish genetic models to study the role of two target genes in pancreatic and endocrine cell development.

RESULTS

Pancreatic progenitor stretch enhancers are enriched for T2D risk variants

To determine whether there is a development-specific genetic contribution to T2D risk, we generated genome-wide chromatin maps of hESCs during their stepwise differentiation into pancreatic progenitors through four distinct developmental stages: definitive endoderm (DE), gut tube (GT), early pancreatic progenitors (PP1), and late pancreatic progenitors (PP2) (Figure 1A). We then used ChromHMM (Ernst & Kellis, 2012) to annotate chromatin states, such as active promoters and enhancers, at all stages of hESC differentiation as well as in primary islets (Figure 1 – figure supplement 1A,B).

Large, contiguous regions of active enhancer chromatin, which have been termed stretch or super-enhancers (Parker et al., 2013; Whyte et al., 2013), are highly enriched for T2D risk variants in islets (Parker et al., 2013; Pasquali et al., 2014). We therefore partitioned active enhancers from each hESC developmental stage and islets into stretch enhancers (SE) and traditional (non-stretch) enhancers (TE) (Figure 1B). Consistent with prior observations of SE features (Parker et al., 2013; Whyte et al., 2013), SE comprised a small subset of all active enhancers (7.7%, 7.8%, 8.8%, 8.1%, 8.1%, and 10.4% of active enhancers in ES, DE, GT, PP1, PP2, and islets, respectively; Figure 1B and Figure 1 – [...] genes proximal to TE (p = 4.68 × 10⁻⁷, 4.64 × 10⁻¹¹, 1.31 × 10⁻⁵, 8.85 × 10⁻⁹, 5.34 × 10⁻⁶, and < 2.2 × 10⁻¹⁶ for expression of genes near TE vs SE in ES, DE, GT, PP1, PP2, and islets, respectively; Figure 1 – figure supplement 1D). Genes near SE in pancreatic progenitors included transcription factors involved in the regulation of pancreatic cell identity, such as NKX6.1 and PDX1 (Figure 1C). Since disease-associated variants are preferentially enriched in narrow peaks of accessible chromatin within broader regions of active chromatin (Greenwald et al., 2019; Thurner et al., 2018; Varshney et al., 2017), we next used ATAC-seq to generate genome-wide maps of chromatin accessibility across all time points of differentiation. Nearly all identified SE contained at least one ATAC-seq peak (Figure 1D and Figure 1 – figure supplement 1E,F). At the PP2 stage, 62.3% of SE harbored one, 32.2% two or three, and 0.7% four or more ATAC-seq peaks (Figure 1 – figure supplement 1F). Similar percentages were observed in earlier developmental precursors and islets.

Having annotated accessible chromatin sites within SE, we next tested for enrichment of T2D-associated variants in SE active in mature islets and in pancreatic developmental stages. We observed the strongest enrichment of T2D-associated variants in islet SE (log enrichment = 2.18, 95% CI = 1.80, 2.54) and late pancreatic progenitor SE (log enrichment = 2.17, 95% CI = 1.40, 2.74), which was more pronounced when only considering variants in accessible chromatin sites within these elements (islet log enrichment = 3.20, 95% CI = 2.74, 3.60; PP2 log enrichment = 3.18, 95% CI = 2.35, 3.79). We next determined whether pancreatic progenitor SE contribute to T2D risk independently of islet SE. Variants in accessible chromatin sites of late pancreatic progenitor SE were enriched for T2D association in a joint model including islet SE (islet log enrichment = 2.94, 95% CI = 2.47, 3.35; PP2 log enrichment = 1.27, 95% CI = 0.24, 2.00; Figure 1F). We also observed enrichment of variants in accessible chromatin sites of pancreatic progenitor SE after conditioning on islet SE (log enrichment = 0.60, 95% CI = -0.87, 1.48), as well as when excluding pancreatic progenitor SE active in islets (log enrichment = 1.62, 95% CI = <-20, 3.14). Examples of known T2D loci with T2D-associated variants in SE active in pancreatic progenitors but not in islets included LAMA1 and PROX1. These results suggest that a subset of T2D variants may affect disease risk by altering regulatory programs specifically active in pancreatic progenitors.
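As a reading aid for the estimates reported above, the following minimal sketch tabulates the log enrichments and flags which 95% confidence intervals exclude zero. The dictionary labels and function names are hypothetical and chosen only for illustration; the numbers are copied from the text, and the lower bound reported as "<-20" is encoded as negative infinity, which is an assumption made here for convenience.

```python
import math

# (log enrichment, CI lower bound, CI upper bound) as reported in the text.
# Labels are illustrative; "<-20" is encoded as -inf (an assumption).
ESTIMATES = {
    "islet SE": (2.18, 1.80, 2.54),
    "PP2 SE": (2.17, 1.40, 2.74),
    "islet SE, accessible sites": (3.20, 2.74, 3.60),
    "PP2 SE, accessible sites": (3.18, 2.35, 3.79),
    "joint model, islet SE": (2.94, 2.47, 3.35),
    "joint model, PP2 SE": (1.27, 0.24, 2.00),
    "PP2 SE conditioned on islet SE": (0.60, -0.87, 1.48),
    "PP2 SE not active in islets": (1.62, -math.inf, 3.14),
}


def ci_excludes_zero(lower, upper):
    """Return True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


def summarize(estimates):
    """Map each label to whether its confidence interval excludes zero."""
    return {
        label: ci_excludes_zero(lower, upper)
        for label, (_, lower, upper) in estimates.items()
    }


if __name__ == "__main__":
    for label, excludes in summarize(ESTIMATES).items():
        print(f"{label}: CI excludes zero = {excludes}")
```

Read this way, the joint-model estimate for PP2 SE has an interval that lies above zero, whereas the intervals for the conditional analysis and for the analysis excluding SE active in islets both include zero.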

Pancreatic progenitor-specific stretch enhancers are near genes that regulate tissue morphogenesis

Having observed enrichment of T2D risk variants in pancreatic progenitor SE independent of islet SE, we next sought to further characterize the regulatory programs of SE with specific function in pancreatic progenitors. We therefore defined a set of pancreatic progenitor-specific stretch enhancers (PSSE) based on the following criteria: (i) annotation as a SE at the PP2 stage, (ii) no classification as a SE at the ES, DE, and GT stages, and (iii) no classification as a TE or SE in islets. Applying these criteria, we identified a total of 492 PSSE genome-wide (Figure 2A and Figure 2 – source data 1).
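The three criteria above amount to a simple filter over per-stage enhancer calls. The sketch below illustrates that logic under stated assumptions: the `EnhancerCall` schema, the state labels ("SE", "TE", "none"), and the example regions are all hypothetical and do not reflect the file formats or region names used in the study.

```python
from dataclasses import dataclass


@dataclass
class EnhancerCall:
    """Enhancer classification of one region at each stage (hypothetical schema)."""
    region: str
    state: dict  # stage name -> "SE", "TE", or "none"


EARLY_STAGES = ("ES", "DE", "GT")


def is_psse(call):
    """Apply criteria (i)-(iii) from the text to a single region."""
    se_at_pp2 = call.state.get("PP2") == "SE"                              # (i)
    not_se_early = all(call.state.get(s) != "SE" for s in EARLY_STAGES)    # (ii)
    inactive_in_islets = call.state.get("islet") not in ("SE", "TE")       # (iii)
    return se_at_pp2 and not_se_early and inactive_in_islets


# Illustrative regions only; a TE call at earlier stages does not exclude a region.
calls = [
    EnhancerCall("region_A", {"ES": "none", "DE": "none", "GT": "TE",
                              "PP1": "TE", "PP2": "SE", "islet": "none"}),
    EnhancerCall("region_B", {"ES": "none", "DE": "none", "GT": "none",
                              "PP1": "SE", "PP2": "SE", "islet": "SE"}),
    EnhancerCall("region_C", {"ES": "none", "DE": "none", "GT": "SE",
                              "PP1": "SE", "PP2": "SE", "islet": "none"}),
    EnhancerCall("region_D", {"ES": "none", "DE": "none", "GT": "none",
                              "PP1": "TE", "PP2": "TE", "islet": "none"}),
]

psse_regions = [c.region for c in calls if is_psse(c)]
```

Note that the criteria as written place no constraint on the PP1 stage, so the sketch does not test it.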

As expected based on their chromatin state classification, PSSE acquired broad deposition of the active enhancer mark H3K27ac at the PP1 and PP2 stages (Figure 2B,C). Coincident with an increase in H3K27ac signal, chromatin accessibility at PSSE also increased (Figure 2B), and 93.5% of PSSE contained at least one accessible chromatin site at the PP2 stage (Figure 2 – figure supplement 1A,B). Further investigation of PSSE chromatin state dynamics at earlier stages of pancreatic differentiation revealed that PSSE were often poised (defined by H3K4me1 in the absence of H3K27ac) prior to activation (42%, 48%, 63%, and 17% of PSSE in ES, DE, GT, and PP1, respectively; Figure 2C), consistent with earlier observations that a poised enhancer state frequently precedes enhancer activation during development (Rada-Iglesias et al., 2011; Wang et al., 2015). Intriguingly, a subset of PSSE was classified as TE earlier in development (13%, 23%, 29%, and 46% of PSSE in ES, DE, GT, and PP1, respectively; Figure 2C), suggesting that SE emerge from smaller regions of active chromatin seeded at prior stages of development. During differentiation into mature islet cells, PSSE lost H3K27ac but largely retained H3K4me1 signal (62% of PSSE) (Figure 2C), persisting in a poised state in terminally differentiated islet cells.
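The poised definition given above (H3K4me1 in the absence of H3K27ac) can be expressed as a small decision rule. The following is a simplified sketch rather than the chromatin state model used in the study: it assumes each region has already been reduced to two boolean mark calls, and the function names, the "unmarked" label, and the toy regions are hypothetical.

```python
def enhancer_state(has_h3k4me1, has_h3k27ac):
    """Classify a region from two boolean histone-mark calls."""
    if has_h3k27ac:
        return "active"    # H3K27ac marks active enhancers
    if has_h3k4me1:
        return "poised"    # H3K4me1 without H3K27ac
    return "unmarked"      # neither mark present (label chosen for this sketch)


def fraction_in_state(regions, state):
    """Fraction of (has_h3k4me1, has_h3k27ac) pairs that fall in the given state."""
    states = [enhancer_state(k4, k27) for k4, k27 in regions]
    return states.count(state) / len(states)


# Toy input: four regions at one hypothetical stage.
toy_regions = [(True, False), (True, True), (False, False), (True, False)]
poised_fraction = fraction_in_state(toy_regions, "poised")
```

Applying such a rule per stage would yield stage-wise percentages of the kind reported for PSSE in the text.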

To gain insight into the transcription factors that regulate PSSE, we conducted motif enrichment analysis of accessible chromatin sites within PSSE (Figure 2 – figure supplement 1C). Consistent with the activation of PSSE upon pancreas induction, motifs associated with transcription factors known to regulate pancreatic development (Conrad et al., 2014; Masui et al., 2007) were enriched, including FOXA (p = 1 × 10⁻³⁴), PDX1 (p = 1 × 10⁻³⁰), GATA (p = 1 × 10⁻²⁵), ONECUT (p = 1 × 10⁻¹⁷), and RBPJ (p = 1 × 10⁻¹⁴), suggesting that pancreatic lineage-determining transcription factors activate PSSE.

Analysis of the extent of PSSE overlap with ChIP-seq binding sites for FOXA1, FOXA2, GATA4, GATA6, PDX1, HNF6, and SOX9 at the PP2 stage substantiated this prediction (p < 1 × 10⁻⁴ for all transcription factors; permutation test; Figure 2D).
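The text does not describe how the permutation test was constructed, so the following is only a generic sketch of one common design: regions are repositioned at random on a single toy chromosome while keeping their lengths, and the observed number of regions overlapping a binding site is compared with the shuffled counts. The coordinates, genome length, number of permutations, and all function names are assumptions made for illustration.

```python
import random


def overlaps(a, b):
    """Half-open intervals (start, end) overlap when each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def count_overlapping(regions, sites):
    """Number of regions that overlap at least one site."""
    return sum(any(overlaps(r, s) for s in sites) for r in regions)


def shuffle_regions(regions, genome_length, rng):
    """Place each region at a uniformly random position, keeping its length."""
    shuffled = []
    for start, end in regions:
        length = end - start
        new_start = rng.randrange(0, genome_length - length + 1)
        shuffled.append((new_start, new_start + length))
    return shuffled


def permutation_p_value(regions, sites, genome_length, n_perm=1000, seed=0):
    """Return the observed overlap count and an empirical one-sided p-value."""
    rng = random.Random(seed)
    observed = count_overlapping(regions, sites)
    as_extreme = sum(
        count_overlapping(shuffle_regions(regions, genome_length, rng), sites) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (as_extreme + 1) / (n_perm + 1)


# Toy data: five 5-kb regions, each containing one 200-bp binding site.
GENOME_LENGTH = 1_000_000
REGIONS = [(100_000 * i, 100_000 * i + 5_000) for i in range(1, 6)]
SITES = [(start + 1_000, start + 1_200) for start, _ in REGIONS]
```

With this design the smallest attainable p-value is 1 / (n_perm + 1), so reporting p < 1 × 10⁻⁴ would require at least 10,000 permutations. A genome-scale analysis would also need to handle multiple chromosomes and unmappable regions, which this sketch omits.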

Annotation of biological functions of predicted target genes for PSSE (nearest gene with FPKM ≥ 1 at PP2 stage) revealed gene ontology terms related to developmental processes, such as tissue morphogenesis (p = 1 × 10⁻⁷) and vascular development (p = 1 × 10⁻⁸), as well as developmental signaling pathways, including BMP (p = 1 × 10⁻⁵), NOTCH (p = 1 × 10⁻⁴), and canonical Wnt signaling (p = 1 × 10⁻⁴; Figure 2E and Figure 2 – source data 2), which have demonstrated roles in pancreas morphogenesis and cell lineage allocation (Ahnfelt-Ronne et al., 2010; Li et al., 2015; Murtaugh, 2008; Sharon et al., 2019; Sui et al., 2013). Consistent with the temporal pattern of H3K27ac deposition at PSSE, transcript levels of PSSE-associated genes increased upon pancreatic lineage induction and peaked at the PP2 stage (p = 1.8 × 10⁻⁸; Figure 2 – figure supplement 1D). Notably, expression of these genes sharply decreased in islets (p < 2.2 × 10⁻¹⁶), underscoring the likely role of these genes in regulating pancreatic development but not mature islet function.
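The target gene rule stated above (nearest gene with FPKM ≥ 1 at the PP2 stage) can be sketched as follows. The text does not specify how distance is measured, so this example assumes the distance from the enhancer midpoint to the transcription start site on a single chromosome; the gene names, coordinates, and expression values are hypothetical.

```python
def nearest_expressed_gene(enhancer_mid, genes, min_fpkm=1.0):
    """Return the name of the closest gene with FPKM >= min_fpkm, or None."""
    expressed = [g for g in genes if g["fpkm"] >= min_fpkm]
    if not expressed:
        return None
    # Distance is measured from the enhancer midpoint to the TSS (an assumption).
    return min(expressed, key=lambda g: abs(g["tss"] - enhancer_mid))["name"]


# Hypothetical genes on one chromosome with PP2-stage expression values.
GENES = [
    {"name": "GENE_A", "tss": 1_000, "fpkm": 0.2},   # closest, but not expressed
    {"name": "GENE_B", "tss": 5_000, "fpkm": 12.0},
    {"name": "GENE_C", "tss": 20_000, "fpkm": 3.0},
]

target = nearest_expressed_gene(1_500, GENES)
```

In this toy case the closest gene is skipped because it falls below the expression threshold, which is the main effect of adding the FPKM filter to a nearest-gene assignment.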

Pancreatic progenitor-specific stretch enhancers are highly specific across T2D-relevant tissues and cell types

We next sought to understand the phenotypic consequences of PSSE activity in the context of T2D pathophysiology. Variants in accessible chromatin sites of PSSE genome-wide were enriched for T2D association (log enrichment = 2.85, 95% CI = <-20, 4.09). We determined enrichment of genetic variants for T2D-related quantitative endophenotypes within accessible chromatin sites of PSSE, as well as all pancreatic progenitor SE (not just progenitor-specific) and islet SE, using LD score regression (Bulik-Sullivan et al., 2015; Finucane et al., 2015). As expected based on prior observations (Parker et al., 2013; Pasquali et al., 2014), we observed enrichment (Z > 1.96) of variants associated with quantitative traits related to insulin secretion and beta cell function within islet SE, exemplified by fasting proinsulin levels, HOMA-B, and acute insulin response (Z = 2.8, Z = 2.6, and Z = 2.2, respectively; Figure 2F). Conversely, PSSE showed a trend toward depletion for these traits although the estimates were not significant. We further tested for enrichment in the proportion of variants in PSSE and islet SE nominally associated (p < 0.05) with beta cell function traits compared to background variants. There was significant enrichment of beta cell trait association among islet SE variants (χ² test; p < 0.05 for all beta cell functional traits except for insulin secretion rate), but no corresponding enrichment for PSSE (Figure 2 – source data 3).
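The Z > 1.96 cutoff used above corresponds to a two-sided p-value of about 0.05 if the Z-scores are treated as standard normal, which is an assumption made for this sketch. The short example below converts the islet SE Z-scores quoted in the text into approximate p-values; the function name and trait labels are illustrative.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a Z-score under a standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Z-scores for islet SE as quoted in the text.
ISLET_SE_Z = {
    "fasting proinsulin": 2.8,
    "HOMA-B": 2.6,
    "acute insulin response": 2.2,
}

islet_se_p = {trait: two_sided_p(z) for trait, z in ISLET_SE_Z.items()}

if __name__ == "__main__":
    for trait, p in islet_se_p.items():
        print(f"{trait}: p = {p:.3g}")
```

Under this assumption all three islet SE estimates fall below p = 0.05, consistent with their description as enriched.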

A prior study found that variants at the LAMA1 locus had stronger effects on T2D risk among lean relative to obese cases (Perry et al., 2012). Since we identified a PSSE at the LAMA1 locus, we postulated that variants in PSSE collectively might have differing impact on T2D risk in cases segregated by BMI. We therefore tested PSSE, as well as pancreatic progenitor SE and islet SE, for enrichment of T2D association in GWAS of lean and obese T2D (Perry et al., 2012) using LD score regression (Bulik-Sullivan et al., 2015; Finucane et al., 2015). We observed nominally significant enrichment of variants in pancreatic progenitor SE for T2D among lean cases (Z = 2.1). Variants in PSSE were mildly enriched for T2D among lean (Z = 1.1) and depleted among obese (Z = -0.70) cases, although neither estimate was significant. By comparison, islet SE showed positive enrichment for T2D among both lean (Z = 1.9) and obese cases (Z = 1.3; Figure 2F). Together, these results suggest that PSSE may affect T2D risk in a manner distinct from islet SE function.

Having observed little evidence for enrichment of PSSE variants for traits related to beta cell function, we asked whether the enrichment of PSSE for T2D-associated variants could be explained by PSSE activity in T2D-relevant tissues and cell types outside the pancreas. We assessed PSSE activity by measuring H3K27ac signal in 95 representative tissues and cell lines from the ENCODE and Epigenome Roadmap projects (Roadmap Epigenomics et al., 2015). Interestingly, there was group-wide specificity of PSSE to pancreatic progenitors relative to other cells and tissues including those relevant to T2D, such as adipose tissue, skeletal muscle, and liver (Figure 2 – figure supplement 1E and Figure 2 – source data 4). Since gene regulation in adipocyte precursors also contributes to T2D risk (Claussnitzer et al., 2014), we further examined PSSE specificity with respect to chromatin states during adipogenesis, using data from human adipose stromal cell differentiation stages (hASC-1 to hASC-4) (Mikkelsen et al., 2010; Varshney et al., 2017). PSSE exhibited virtually no active chromatin during adipogenesis (9, 8, 6, and 8 out of the 492 PSSE were active enhancers in hASC-1, hASC-2, hASC-3, and hASC-4, respectively; Figure 2 – figure supplement 1F). These findings identify PSSE as highly pancreatic progenitor-specific across T2D-relevant tissues and cell types.

Identification of pancreatic progenitor-specific stretch enhancers harboring T2D-associated variants

Given the relative specificity of PSSE to pancreatic progenitors, we next sought to identify T2D-associated variants in PSSE at specific loci which may affect pancreatic development. We therefore identified variants in PSSE with evidence of T2D association (at p = 4.7 × 10⁻⁶) after correcting for the total number of variants in PSSE genome-wide (n = 10,738). In total there were 49 variants in PSSE with T2D association exceeding this threshold mapping to 11 loci (Figure 3A). This included variants at 9 loci with known genome-wide significant T2D association (PROX1, ST6GAL1, SMARCAD1, XKR6, INS-IGF2, HMGA2, SMEK1, HMG20A, and LAMA1), as well as at two previously unreported loci with sub-genome-wide significant association, CRB2 and PGM1. To identify candidate target genes of the T2D-associated PSSE in pancreatic progenitors, we analyzed the expression of all genes within the same topologically associated domain (TAD) as the PSSE in PP2 cells and in primary human embryonic pancreas tissue (Figure 3B and Figure 3 – figure supplement 1A). These expressed genes are candidate effector transcripts of T2D-associated variants in pancreatic progenitors. As many pancreatic progenitor SE remain poised in mature islets (Figure 2C), we considered whether T2D-associated variants in PSSE could have gene regulatory function in islets that is re-activated in the disease state. We therefore assessed overlap of PSSE variants with accessible chromatin of islets from T2D donors (Khetan et al., 2018). None of the strongly T2D-associated variants in PSSE (p = 4.7 × 10⁻⁶) overlapped an islet accessible chromatin site in T2D islets, arguing against the relevance of PSSE in broadly regulating islet gene activity during T2D.
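The significance threshold quoted above is consistent with a Bonferroni correction over the 10,738 variants in PSSE. The sketch below reproduces that arithmetic; the family-wise error rate of 0.05 is an assumption (it is not stated in the text), and the function names are illustrative.

```python
ALPHA = 0.05          # family-wise error rate (assumed; not stated in the text)
N_VARIANTS = 10_738   # variants in PSSE genome-wide, as given in the text


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_VARIANTS)


def passes(p_value):
    """True when a variant's association p-value is below the corrected threshold."""
    return p_value < threshold


if __name__ == "__main__":
    print(f"Bonferroni threshold: {threshold:.2e}")
```

Dividing 0.05 by 10,738 gives approximately 4.7 × 10⁻⁶ after rounding to two significant figures, matching the value used in the text.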

A pancreatic progenitor-specific stretch enhancer at LAMA1 harbors T2D risk variants and regulates LAMA1 expression selectively in pancreatic progenitors

Variants in a PSSE at the LAMA1 locus were associated with T2D at genome-wide significance (Figure 3A), and LAMA1 was highly expressed in the human embryonic pancreas (Figure 3B). Furthermore, the activity of the PSSE at the LAMA1 locus was almost exclusively restricted to pancreatic progenitors (Figure 3 – figure supplement 1B,C), and was further among the most progenitor-specific across all PSSE harboring T2D risk variants (Figure 3C). In addition, reporter gene assays in zebrafish embryos have shown that this enhancer drives gene expression specific to pancreatic progenitors in vivo (Cebola et al., 2015). We therefore postulated that the activity of T2D-associated variants within the LAMA1 PSSE is relevant for gene regulation in pancreatic progenitors, and we sought to characterize the LAMA1 PSSE in greater depth.

Multiple T2D-associated variants mapped within the LAMA1 PSSE, and these variants were further in the 99% credible set in fine-mapping data from the DIAMANTE consortium (Mahajan et al., 2018) (Figure 4A). No other variants in the 99% credible set mapped in an accessible chromatin site active in islets from either non-diabetic or T2D samples. The PSSE is intronic to the LAMA1 gene and contains regions of poised chromatin and TE at prior developmental stages (Figure 4A). Consistent with its stepwise genesis as a SE throughout development, regions of open chromatin within the LAMA1 PSSE were already present at the DE and GT stages. Furthermore, pancreatic lineage-determining transcription factors, such as FOXA1, FOXA2, GATA4, GATA6, HNF6, SOX9, and PDX1, were all bound to the PSSE at the PP2 stage (Figure 4B). Among credible set variants in the LAMA1 PSSE, rs10502347 overlapped an ATAC-seq peak as well as ChIP-seq sites for multiple pancreatic lineage-determining transcription factors. Additionally, rs10502347 directly coincided with a SOX9 footprint identified in ATAC-seq data from PP2 cells, and the T2D risk allele C is predicted to disrupt SOX9 binding (Figure 4B). Consistent with the collective endophenotype association patterns of PSSE (Figure 2F), rs10502347 showed no association with beta cell function (p = 0.81, 0.23, 0.46 for fasting proinsulin levels, HOMA-B, and acute insulin response, respectively; Figure 4 – figure supplement 1A). Thus, T2D variant rs10502347 is predicted to affect the binding of pancreatic transcription factors and does not appear to affect beta cell function. Enhancers can control gene expression over large genomic distances, and therefore their target genes cannot be predicted based on proximity alone. To directly assess the function of the LAMA1 PSSE in regulating gene activity, we utilized CRISPR-Cas9-mediated genome editing to generate two independent clonal hESC lines harboring homozygous deletions of the LAMA1 PSSE (hereafter referred to as ∆LAMA1Enh; Figure 4 – figure supplement 1B). We examined LAMA1 expression in ∆LAMA1Enh compared to control cells throughout stages of pancreatic differentiation. Consistent with the broad expression of LAMA1 across developmental and mature tissues, control cells expressed LAMA1 at all stages (Figure 4C). LAMA1 was expressed at similar levels in ∆LAMA1Enh and control cells at early developmental stages, but was significantly reduced in PP2 cells derived from ∆LAMA1Enh clones (p = 0.319, 0.594, 0.945, 0.290, and < 1 × 10⁻⁶ for comparisons in ES, DE, GT, PP1, and PP2, respectively; Figure 4D). To next investigate whether the LAMA1 PSSE regulates other genes at this locus, we examined expression of genes mapping in the same TAD. ARHGAP28 was the only other expressed gene within the TAD and, although its expression was not significantly different from controls (p adj. > 0.05), it showed a trend toward lower expression in ∆LAMA1Enh PP2 cells (Figure 4E), raising the possibility that ARHGAP28 is an additional target gene of the LAMA1 PSSE. Together, these results demonstrate that while LAMA1 itself is broadly expressed across developmental stages, the T2D-associated PSSE regulates LAMA1 expression specifically in pancreatic progenitors.

To determine whether deletion of the LAMA1 PSSE affects pancreatic development, we generated PP2 stage cells from ∆LAMA1Enh and control hESC lines and analyzed pancreatic cell fate commitment by flow cytometry and immunofluorescence staining for PDX1 and NKX6.1 (Figure 4 – figure supplement 1C,D). At the PP2 stage, ∆LAMA1Enh and control cultures contained similar percentages of PDX1- and NKX6.1-positive cells. Furthermore, mRNA expression of PDX1, NKX6.1, PROX1, PTF1A, and SOX9 was either unaffected or only minimally reduced (p adj. = 3.56 × 10⁻², 0.224, 0.829, 8.14 × 10⁻², and 0.142, for comparisons of PDX1, NKX6.1, PROX1, PTF1A, and SOX9 expression, respectively; Figure 4 – figure supplement 1E), and the overall gene expression profiles were similar in ∆LAMA1Enh and control PP2 cells (Figure 4 – figure supplement 1F and Figure 4 – source data 1,2). These findings indicate that in vitro pancreatic lineage induction is unperturbed in ∆LAMA1Enh cells exhibiting reduced LAMA1 expression.

Pancreatic progenitor-specific stretch enhancers at the CRB2 and PGM1 loci harbor T2D-associated variants

Multiple variants with evidence for T2D association in PSSE mapped outside of known risk loci, such as those mapping to CRB2 and PGM1 (Figure 3A). As with the LAMA1 PSSE, PSSE harboring variants at CRB2 and PGM1 were intronic to their respective genes, harbored ATAC-seq peaks, and bound pancreatic lineage-determining transcription factors FOXA1, FOXA2, GATA4, GATA6, HNF6, SOX9, and PDX1 (Figure 5A,B and Figure 5 – figure supplement 1A,B). Compared to the LAMA1 PSSE, CRB2 and PGM1 PSSE were less specific to pancreatic progenitors and exhibited significant H3K27ac signal in several other tissues and cell types, most notably brain, liver, and the digestive tract (Figure 5 – figure supplement 1C,D).

CRB2 is a component of the Crumbs protein complex involved in the regulation of cell polarity and neuronal, heart, retinal, and kidney development (Alves et al., 2013; Bulgakova & Knust, 2009; Dudok et al., 2016; Jimenez-Amilburu & Stainier, 2019; Slavotinek et al., 2015). However, its role in pancreatic development is unknown. To determine whether the CRB2 PSSE regulates CRB2 expression in pancreatic progenitors, we generated two independent hESC clones with homozygous deletions of the CRB2 PSSE (hereafter referred to as ∆CRB2Enh; Figure 5 – figure supplement 2A) and performed pancreatic differentiation of ∆CRB2Enh and control hESC lines. In control cells, CRB2 was first expressed at the GT stage and increased markedly at the PP1 stage (Figure 5C). This pattern of CRB2 expression is consistent with H3K27ac deposition at the CRB2 PSSE in GT stage cells and classification as a SE at the PP1 and PP2 stages (Figure 5A and Figure 5 – figure supplement 1C). In ∆CRB2Enh cells, we observed upregulation of CRB2 expression at earlier developmental stages, in particular at the DE and GT stages (p < 1 × 10⁻⁶ at both stages; Figure 5D), suggesting that the CRB2 PSSE may be associated with repressive transcriptional complexes prior to pancreas induction. At the PP2 stage, CRB2 expression was significantly reduced in ∆CRB2Enh cells (p adj. = 3.51 × 10⁻³; Figure 5D), whereas the expression of other genes in the same TAD was not affected (p adj. ≥ 0.05; Figure 5E). Thus, the CRB2 PSSE specifically regulates CRB2 and is required for CRB2 expression in pancreatic progenitors.

Phenotypic characterization of PP2 stage ∆CRB2Enh cells revealed similar percentages of PDX1- and NKX6.1-positive cells as in control cells (Figure 5 – figure supplement 2B,C). The expression of pancreatic transcription factors and global gene expression profiles were also similar in ∆CRB2Enh and control PP2 cells (Figure 5 – figure supplement 2D,E and Figure 5 – source data 1). Thus, similar to LAMA1 PSSE deletion, CRB2 PSSE deletion does not overtly impair pancreatic lineage induction in the in vitro hESC differentiation system.

lama1 and crb2 zebrafish morphants display annular pancreas and decreased beta cell mass

Based on their classification as extracellular matrix and cell polarity proteins, respectively, laminin (encoded by LAMA1) and CRB2 are predicted to regulate processes related to tissue morphogenesis, such as cell migration, tissue growth, and cell allocation within the developing organ. Furthermore, PSSE in general were enriched for proximity to genes involved in tissue morphogenesis (Figure 2E), suggesting that T2D risk variants acting within PSSE could have roles in pancreas morphogenesis. Since cell migratory processes and niche-specific signaling events are not fully modeled during hESC differentiation, we reasoned that the in vitro pancreatic differentiation system might not be suitable for studying laminin and CRB2 function in pancreatic development.

To circumvent these limitations, we employed zebrafish as an in vivo vertebrate model to study the effects of reduced lama1 and crb2 levels on pancreatic development. The basic organization and cell types in the pancreas as well as the genes regulating endocrine and exocrine pancreas development are highly conserved between zebrafish and mammals. To examine the expression of laminin and Crb proteins, we used TgBAC(pdx1:eGFP)bns13 embryos, in which eGFP marks the duodenum and developing pancreas, consistent with mammalian Pdx1 expression (Figure 6 – figure supplement 1A). At 48 hours post-fertilization (hpf), laminin was detected adjacent to pancreatic cells expressing pdx1:eGFP (Figure 6 – figure supplement 1A, yellow arrow), whereas Crb was localized in pdx1:eGFP pancreatic cells (Figure 6 – figure supplement 1B). Within the developing foregut region, Crb expression was exclusive to the pancreatic anlage (Figure 6 – figure supplement 1B).

To determine the respective functions of lama1 and crb2 in pancreatic development, we performed knockdown experiments using anti-sense morpholinos directed against lama1 and the two zebrafish crb2 genes, crb2a and crb2b (Omori & Malicki, 2006; Pollard et al., 2006). We determined effects of these knockdowns in the Tg(ptf1a:eGFP)jh1 embryos to visualize the acinar pancreas, which comprises the majority of the organ. Consistent with prior studies (Pollard et al., 2006), lama1 morphants exhibited reduced body size and other gross anatomical defects at 78 hpf, whereas crb2a/b morphants appeared grossly normal. Both lama1 and crb2a/b morphants displayed an annular pancreas (15 out of 34 lama1 and 27 out of 69 crb2a/b morphants) characterized by pancreatic tissue partially or completely encircling the duodenum (Figure 6A-D), a phenotype indicative of impaired migration of pancreatic progenitors during pancreas formation. These findings suggest that both lama1 and crb2a/b control cell migratory processes during early pancreatic development and that reduced levels of lama1 or crb2a/b impair pancreas morphogenesis.

To gain insight into the effects of lama1 and crb2a/b knockdown on pancreatic endocrine cell development, we examined beta cell numbers (insulin+ cells) at 78 hpf. We also evaluated potential synergistic effects of combined lama1 and crb2a/b knockdown. To account for the reduction in body and pancreas size in lama1 morphants, we compared cell numbers in 78 hpf lama1 morphants with 50 hpf control embryos, which have a similarly sized acinar compartment as 78 hpf lama1 morphants. Beta cell numbers were significantly reduced in both lama1 and crb2a/b morphants (p = 8.0 × 10⁻³ and 4.0 × 10⁻³ for comparisons of lama1 and crb2a/b morphants, respectively; Figure 6E,F), showing that reduced lama1 and crb2a/b levels impair beta cell development. Although not significant, morphants with a combined knockdown of lama1 and crb2a/b had a trend toward lower beta cell numbers than individual morphants, suggestive of additive effects (p = 0.42; Figure 6F). Furthermore, we found that nearly all lama1, crb2a/b, and combined lama1 and crb2a/b morphants without an annular pancreas had reduced beta cell numbers, indicating independent roles of lama1 and crb2 in pancreas morphogenesis and beta cell differentiation. Finally, to investigate the contributions of individual crb2 genes to the observed phenotype, we performed knockdown experiments using morpholinos against crb2a and crb2b alone. Only crb2b morphants showed a significant reduction in beta cell numbers (p = 4.4 × 10⁻²; Figure 6 – figure supplement 2), suggesting that crb2b is the predominant crb2 gene required for beta cell development. Combined, these findings demonstrate that lama1 and crb2 are regulators of pancreas morphogenesis and beta cell development in vivo.

DISCUSSION

In this study, we identify T2D-associated variants localized within chromatin active in pancreatic progenitors but not islets or other T2D-relevant tissues, suggesting a novel mechanism whereby a subset of T2D risk variants specifically alters pancreatic developmental processes. We link T2D-associated enhancers active in pancreatic progenitors to the regulation of LAMA1 and CRB2 and demonstrate a functional requirement in zebrafish for lama1 and crb2 in pancreas morphogenesis and endocrine cell formation. Furthermore, we provide a curated list of T2D risk-associated enhancers and candidate effector genes for further exploration of how the regulation of developmental processes in the pancreas can predispose to T2D.

Our analysis identified eleven loci where T2D-associated variants mapped to SE specifically active in pancreatic progenitors. Among these loci was LAMA1, which has stronger effects on T2D risk in lean than in obese individuals (Perry et al., 2012). We also found evidence that variants in PSSE collectively have stronger enrichment for T2D in lean individuals, although the small number of PSSE and the limited sample size of the BMI-stratified T2D genetic data prohibit a more robust comparison. There was also a notable lack of enrichment among PSSE variants for association with traits related to insulin secretion and beta cell function. If T2D-associated variants in PSSE indeed confer diabetes susceptibility by affecting beta cell development, the question arises as to why variants associated with traits related to beta cell function are not enriched within PSSE. As genetic association studies of endophenotypes are based on data from non-diabetic subjects, a possible explanation is that variants affecting beta cell developmental

and cell lines from the ENCODE and Epigenome Roadmap projects. See also Figure 3 – figure supplement 1.

and islets (ISL) is shown. (C) H3K27ac signal at LAMA1-associated PSSE in tissues and cell lines from the ENCODE and Epigenome Roadmap projects as well as in developmental intermediates and islets.

GATA6, HNF6, SOX9, and PDX1 ChIP-seq profiles at the PGM1 PSSE in PP2 cells. The variants rs2269247, rs2301055, rs2301054, and rs2269246 (black) overlap with transcription factor binding sites. (C) H3K27ac signal at CRB2-associated PSSE in tissues and cell lines from the ENCODE and Epigenome Roadmap projects as well as in developmental intermediates and islets (ISL). (D) H3K27ac signal at PGM1-associated PSSE in tissues and cell lines from the ENCODE and Epigenome Roadmap projects as well as in developmental intermediates and islets.

SOURCE DATA

Figure 2 – source data 1: Chromosomal coordinates of pancreatic progenitor-specific stretch enhancers (PSSE).

Figure 2 – source data 2: Enriched gene ontology terms for PSSE-associated genes.

Figure 2 – source data 3: Proportion of variants nominally associated with beta cell

Figure 2 – source data 4: Tissue identity of downloaded data from ROADMAP

Figure 4 – source data 1: Genes downregulated in ∆LAMA1Enh PP2 stage cells compared to control cells (p adj. < .05).

Figure 4 – source data 2: Genes upregulated in ∆LAMA1Enh PP2 stage cells compared to control cells (p adj. < .05).

Figure 5 – source data 1: Genes downregulated in ∆CRB2Enh PP2 stage cells compared to control cells (p adj. < .05).

(Dong et al., 2008; Field et al., 2003; Kimmel et al., 2015). To analyze pancreatic