#GotGlycans: Role of N343 Glycosylation on the SARS-CoV-2 S RBD Structure and Co-Receptor Binding Across Variants of Concern
Callum M. Ives
Department of Chemistry, Maynooth University, Maynooth, Ireland
Department of Chemistry, University of Alberta, Edmonton, Alberta, Canada
Département de Biochimie et Médecine Moléculaire, Université de Montréal, Québec, Canada
Hamilton Institute, Maynooth University, Maynooth, Ireland
Human Health Therapeutics Research Centre, Life Sciences Division, National Research Council Canada, Québec, Canada
December 2023
Glycosylation of the SARS-CoV-2 spike (S) protein represents a key target for viral evolution because it affects both viral evasion and fitness. Successful variations in the glycan shield are difficult to achieve though, as protein glycosylation is also critical to folding and to structural stability. Within this framework, the identification of glycosylation sites that are structurally dispensable can provide insight into the evolutionary mechanisms of the shield and inform immune surveillance. In this work we show, through over 45 μs of cumulative sampling from conventional and enhanced molecular dynamics (MD) simulations, how the structure of the immunodominant S receptor binding domain (RBD) is regulated by N-glycosylation at N343 and how this glycan's structural role changes from WHu-1, alpha (B.1.1.7), and beta (B.1.351), to the delta (B.1.617.2) and omicron (BA.1 and BA.2.86) variants. More specifically, we find that the amphipathic nature of the N-glycan is instrumental to preserve the structural integrity of the RBD hydrophobic core and that loss of glycosylation at N343 triggers a specific and consistent conformational change. We show how this change allosterically regulates the conformation of the receptor binding motif (RBM) in the WHu-1, alpha and beta RBDs, but not in the delta and omicron variants, due to mutations that reinforce the RBD architecture. In support of these findings, we show that the binding of the RBD to monosialylated ganglioside co-receptors is highly dependent on N343 glycosylation in the WHu-1, but not in the delta RBD, and that affinity changes significantly across VoCs. Ultimately, the molecular and functional insight we provide in this work reinforces our understanding of the role of glycosylation in protein structure and function and it also allows us to identify the structural constraints within which the glycosylation site at N343 can become a hotspot for mutations in the SARS-CoV-2 S glycan shield.
Introduction
The SARS-CoV-2 spike (S) glycoprotein is responsible for viral fusion with the host cell, initiating
an infection that leads to COVID-19
(Walls et al., 2020; Wrapp et al., 2020)
. S is a homotrimer with
a structure subdivided into two topological domains, namely S1 and S2, see Figure 1a, separated
by a furin site, which is cleaved in the pre-fusion architecture
(Walls et al., 2020; Wrapp et al.,
2020)
. In the Wuhan-Hu-1 strain (WHu-1), and still in most variants of concern (VoCs), host cell
fusion is predominantly triggered by S binding to the Angiotensin-Converting Enzyme 2 (ACE2)
receptor located on the host cell surface
(Jackson et al., 2022; Wrapp et al., 2020)
. This process
is supported by glycan co-receptors, such as heparan sulfate (HS) in the extracellular
matrix
(Clausen et al., 2020; Kearns et al., 2022)
and by monosialylated ganglioside
oligosaccharides (GM1os and GM2os) protruding from the surface of the host cells
(Nguyen et al.,
2021)
. The interaction with ACE2 requires a dramatic conformational change of the S, known as
'opening', where one or more Receptor Binding Domains (RBDs) in the S1 subdomain become
exposed. The region of the RBD in direct contact with the ACE2 surface is known as receptor
binding motif (RBM)
(Jackson et al., 2022; Lan et al., 2020; Yi et al., 2020)
. Ultimately, S binding
to ACE2 causes shedding of the S1 subdomain and the transition to a post-fusion conformation,
which exposes the fusion peptide near the host cell surface, leading to viral entry
(Dodero-Rojas
et al., 2021; Jackson et al., 2022)
.
To exert its functions, S sticks out from the viral envelope where it is exposed to recognition. To
evade the host immune system, enveloped viruses hijack the host cell's glycosylation machinery
to cover S with a dense coat of host carbohydrates, known as a glycan shield
(Casalino et al.,
2020; Chawla et al., 2022; Grant et al., 2020; Turoňová et al., 2020; Watanabe et al., 2020b,
2020a, 2019)
. In SARS-CoV-2 the glycan shield effectively screens over 60% of the S protein
surface
(Casalino et al., 2020)
, leaving the RBD, when open, and regions of the N-terminal domain
(NTD) vulnerable to immune recognition
(Bangaru et al., 2022; Carabelli et al., 2023; Chawla et
al., 2022; Chen et al., 2022; Harvey et al., 2021; Piccoli et al., 2020)
. The RBD is targeted by
approximately 90% of serum neutralising antibodies
(Piccoli et al., 2020)
and is thus a highly
effective model not only to screen antibody specificity
(Du et al., 2020; Lan et al., 2020; Lin et al.,
2022)
and interactions with host cell co-receptors
(Clausen et al., 2020; Mycroft-West et al., 2020;
Nguyen et al., 2021)
, but also as a protein scaffold for COVID-19 vaccines
(Dickey et al., 2022;
Kleanthous et al., 2021; Montgomerie et al., 2023; Ochoa-Azze et al., 2022; Tai et al., 2020;
Valdes-Balbin et al., 2021; Yang et al., 2022)
.
As a direct consequence, the RBD is under great evolutionary pressure. Mutations of the RBD
leading to immune escape are particularly concerning
(Cao et al., 2023; Starr et al., 2021)
,
especially when such changes enhance the binding affinity for ACE2 or give access to alternative
entry routes
(Baggen et al., 2023; Cervantes et al., 2023)
. The identification of mutational
hotspots
(Cao et al., 2023)
and the effects of mutations in and around the RBM have been and
are under a great deal of scrutiny
(Barton et al., 2021; Bloom et al., 2023; Cao et al., 2023;
Dadonaite et al., 2022; Greaney et al., 2021; Starr et al., 2022a, 2022b, 2021)
. Yet, less attention
is devoted to mutations in the glycan shield, which have been shown to lead to dramatic changes
in infectivity
(Harbison et al., 2022; Kang et al., 2021; Zhang et al., 2022)
and in immune
escape
(Newby et al., 2022; Pegg et al., 2023)
. Successful changes in the glycan shield are
evolutionarily difficult to achieve, because the nature and pattern of glycosylation of the
S is crucial not only to the efficiency of viral entry and evasion, but also to facilitate folding and to
preserve the structural integrity of the functional fold. Therefore, identifying potential evolutionary
hotspots in the S shield is a complex matter, yet of crucial importance to immune surveillance.
Potential changes of the shield, e.g. loss, shift or gain of new glycosylation sites, can likely occur
only where these do not negatively impact the integrity of the underlying, functional protein
architecture. In this work we present and discuss the case of N343, a key glycosylation site on
the RBD. Results of extensive sampling from molecular dynamics (MD) simulations exceeding 45
μs, show how the loss of N-glycosylation at N343 affects the structure, dynamics and co-receptor
binding of the RBD and how these effects are modulated by mutations in the underlying protein,
going from the WHu-1 strain through the VoCs designated as alpha (B.1.1.7), beta (B.1.351),
delta (B.1.617.2) and omicron (BA.1). In addition, we provide important insight into the structure
and dynamics of the omicron BA.2.86 RBD. This variant, designated as variant under monitoring
(VUM) and commonly referred to as 'pirola', carries a newly gained N-glycosylation site at N354,
which represents the first change in the RBD shielding since the ancestral strain.
The SARS-CoV-2 S RBD (aa 327-540) from the WHu-1 (China, 2019) to the EG.5.1 (China, 2023)
shows two highly conserved N-glycosylation sites
(Harbison et al., 2022)
, one at N331 and the
other at N343
(Watanabe et al., 2020a)
. While glycosylation at N331 is located on a highly flexible
region linking the RBD to the NTD, the N343 glycan covers a large portion of the RBD
(Casalino
et al., 2020; Harbison et al., 2022)
, stretching across the protein surface and forming a bridge
connecting the two helical regions that frame the beta sheet core, see Figure 1b. In this work we
show that removal of the N343 glycan induces a conformational change which in WHu-1, alpha
and beta allosterically controls the structure and dynamics of the RBM, see Figure 1c. In delta
and omicron these effects are significantly dampened by mutations that strengthen the RBD
architecture. Further to this molecular insight, we show that enzymatic removal of the N343 glycan
affects binding of monosialylated ganglioside co-receptors
(Nguyen et al., 2021)
in the WHu-1
RBD, but not in delta. We also observe that the affinity of the RBD for GM1os and GM2os
changes significantly across the VoCs, with beta and omicron exhibiting the weakest
binding.
Ultimately, the molecular insight we provide in this work adds to the ever growing evidence
supporting the role of glycosylation in protein folding and structural stability. This information is
not only central to structural biology, but also critical to the design of novel COVID-19 vaccines
that may or may not carry glycans
(Huang et al., 2022)
, as well as instrumental to our
understanding of the evolutionary mechanisms regulating the shield.
Material and Methods
Computational methods. All simulations were performed using additive, all-atom force fields,
namely the AMBER 14SB parameter set
(Maier et al., 2015)
to represent protein atoms and
counterions (200 mM of NaCl), GLYCAM06j-1
(Kirschner et al., 2008)
to represent glycans, and
TIP3P for water molecules
(Jorgensen et al., 1983)
. All production trajectories from conventional
(deterministic) MD simulations were run for a minimum of 2 μs to ensure convergence. In some
cases, we extended the simulations up to 3 μs to assess the stability of specific conformational
transitions, where deemed necessary. All Gaussian accelerated MD (GaMD)
(Miao et al., 2015; J.
Wang et al., 2021)
production trajectories were run for 2 μs. All simulations of the N343
glycosylated and non-glycosylated RBDs were started from identical 3D structures. The glycans
at N331 and N343 were rebuilt as FA2G2 (GlyTouCan-ID G00998NI) based on glycoproteomics
data
(Newby et al., 2022; Watanabe et al., 2020a)
with 3D structures from our GlycoShape
database (https://glycoshape.org, available open access from the end of November 2023). Further information on
the RBD structures and PDB IDs for all variants, together with details on the MD systems set-up,
equilibration protocols and total sampling times allocations are available as Supplementary
Material. Sequences for all VoC and VUM RBDs (aa 327-540) were obtained from
https://viralzone.expasy.org/9556.
Proteins and glycans. Expression and purification of recombinant WHu-1, Alpha, Beta, Delta
and Omicron RBD (EG319RVQP…VN541F, UniProt number P0DTC2) with C-terminal FLAG
(SGDYKDDDDKG) and His tags (HHHHHHG) used in the current study were described
elsewhere
(Akache et al., 2021; Colwill et al., 2022)
. Mutations of SARS-CoV-2 RBD VOCs are
shown in Figure S1. Proteins were purified using standard immobilised metal-ion affinity
chromatography (IMAC), followed by size-exclusion chromatography on Superdex-75 to remove
dimers as described
(Forest-Nault et al., 2022)
. To obtain endo F3-treated WHu-1 and Delta RBD,
100 μg of each RBD was treated with endo F3 (purchased from New England Biolabs) in 1x
Glycobuffer (50 mM sodium acetate, pH 4.5) at 37 °C overnight. Each protein was dialyzed and
concentrated against 100 mM ammonium acetate (pH 7.4) using an Amicon 0.5-mL
microconcentrator (EMD Millipore) with a 10-kDa MW cutoff and stored at –80 °C until use. The
concentrations of protein stock solutions were estimated by UV absorption (280 nm). The
oligosaccharides of GM1 and GM2, Galβ1-3GalNAcβ1-4(Neu5Acα2-3)Galβ1-4Glc (MW 998.34
Da, GM1os) and GalNAcβ1-4(Neu5Acα2-3)Galβ1-4Glc (MW 836.29 Da, GM2os), respectively,
were purchased from Elicityl SA (Crolles, France). 1 mM stock solutions of each glycan were
prepared by dissolving a known mass of glycan in ultrafiltered Milli-Q water. All stock solutions
were stored at -20 °C until needed.
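As a quick check on the stock preparation described above, the mass of glycan needed follows from m = C × V × MW. The short sketch below uses the molecular weights quoted in the text; the 1 mL volume is an assumption made only for illustration, since the prepared volume is not stated.

```python
# Molecular weights (Da, i.e. g/mol) as quoted in the text.
MW = {"GM1os": 998.34, "GM2os": 836.29}


def stock_mass_mg(mw_g_per_mol, conc_mM=1.0, volume_mL=1.0):
    """Mass (mg) of glycan to dissolve for a target concentration and volume.

    mmol/L * L gives mmol; mmol * g/mol gives mg.
    The default 1 mL volume is an illustrative assumption.
    """
    volume_L = volume_mL / 1000.0
    return conc_mM * volume_L * mw_g_per_mol


if __name__ == "__main__":
    for name, mw in MW.items():
        print(f"{name}: {stock_mass_mg(mw):.3f} mg per 1 mL of 1 mM stock")
```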
ESI-MS affinity measurements. Affinities (Kd) of glycan ligands for RBD were measured by the
direct ESI-MS binding assay. The ESI-MS affinity measurements were performed in positive ion
mode on a Q Exactive Orbitrap mass spectrometer (Thermo Fisher Scientific). The capillary
temperature was 150 °C, and the S-lens RF level was 100; an automatic gain control target of
5 × 10⁵ and a maximum injection time of 100 ms were used. The resolving power was 17,500. The
instrument was equipped with a modified nanoflow ESI (nanoESI) source. NanoESI tips with an
outer diameter (o.d.) of ∼5 µm were pulled from borosilicate glass (1.2 mm o.d., 0.69 mm i.d., 10
cm length, Sutter Instruments, CA) with a P-97 micropipette puller (Sutter Instruments). A
platinum wire was inserted into the nanoESI tip, making contact with the sample solution, and a
voltage of 0.8 kV was applied. Each sample solution contained a given RBD (5 μM) and GM1os or
GM2os (at three different concentrations ranging from 10 to 150 μM) in ammonium acetate (100
mM, pH 7.4). Data acquisition and pre-processing were performed using the Xcalibur software
(version 4.1); ion abundances were extracted using the in-house software SWARM
(Kitov et al.,
2019)
. A brief description of the data analysis procedures used in this work is given in the
Supporting Information.
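The data analysis procedure actually used is described in the Supporting Information and is not reproduced here. As a hedged illustration only, the sketch below shows the relation commonly used in direct ESI-MS assays for a 1:1 protein-ligand complex, where the abundance ratio R = Ab(PL)/Ab(P) is taken to equal the solution ratio [PL]/[P]. It assumes equal response factors for free and bound protein and applies no correction for nonspecific binding; the function names and the example abundances are hypothetical.

```python
def association_constant(ab_bound, ab_free, protein_conc_M, ligand_conc_M):
    """Apparent Ka (M^-1) for 1:1 binding from ESI-MS ion abundances.

    Assumes R = Ab(PL)/Ab(P) equals [PL]/[P] in solution.
    """
    r = ab_bound / ab_free
    # Concentration of complex: [PL] = [P]0 * R / (1 + R)
    bound = protein_conc_M * r / (1.0 + r)
    free_ligand = ligand_conc_M - bound
    if free_ligand <= 0:
        raise ValueError("bound ligand exceeds total ligand; check inputs")
    # Ka = [PL] / ([P][L]) = R / [L]
    return r / free_ligand


def dissociation_constant(ab_bound, ab_free, protein_conc_M, ligand_conc_M):
    """Apparent Kd (M), the reciprocal of Ka."""
    return 1.0 / association_constant(
        ab_bound, ab_free, protein_conc_M, ligand_conc_M
    )


if __name__ == "__main__":
    # Hypothetical abundances for 5 uM RBD and 50 uM ligand.
    ka = association_constant(4.0e5, 8.0e6, 5e-6, 50e-6)
    print(f"Ka = {ka:.0f} M^-1")
```

The concentrations in the usage example match the ranges quoted in the text (5 μM RBD, 10 to 150 μM ligand); the abundances are invented for the demonstration.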
Protease digestion. 20 µg of a given purified protein (intact and endoF3-treated WT RBDs) were
dissolved in 100 μL of 8 M urea in 100 mM Tris-HCl (pH 8.0) containing 3 mM EDTA and incubated
at room temperature for 1 h. The denatured protein was then reduced with 5 μL of 500 mM
dithiothreitol (DTT; Sigma-Aldrich) at room temperature for 1 h; followed by alkylation with 12 µL
of 500 mM iodoacetamide (Sigma-Aldrich) at room temperature for 20 min in the dark. The
reaction was quenched by adding 5 µL of 250 mM DTT, and the solution buffer was exchanged
using a 10-kDa Amicon Ultra centrifugal filter. The samples were loaded onto the filter and
centrifuged at 14 000×g for 15 min. The glycoprotein solution was subsequently digested with
trypsin/chymotrypsin (substrate/enzyme (wt/wt) = 50) in 50 mM ammonium bicarbonate (pH 8.0)
for 18 h at 37 °C. The reaction was quenched by heat inactivation at 100 °C for 10 min. The
lyophilized sample was stored at -20 °C until LC–MS analysis.
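The quantities in the digestion protocol above can be cross-checked with simple dilution arithmetic. The sketch below estimates the final DTT and iodoacetamide concentrations after each addition and the enzyme mass implied by the substrate/enzyme ratio. It assumes that volumes are additive and that no volume is lost between steps, which the text does not state.

```python
def final_conc_mM(stock_mM, added_uL, volume_before_uL):
    """Concentration after adding a stock to an existing volume (additive volumes assumed)."""
    return stock_mM * added_uL / (volume_before_uL + added_uL)


def enzyme_mass_ug(substrate_ug, substrate_to_enzyme_ratio):
    """Enzyme mass for a given substrate/enzyme (wt/wt) ratio."""
    return substrate_ug / substrate_to_enzyme_ratio


# Values taken from the protocol in the text.
dtt_mM = final_conc_mM(500.0, 5.0, 100.0)    # 5 uL of 500 mM DTT into 100 uL
iaa_mM = final_conc_mM(500.0, 12.0, 105.0)   # 12 uL of 500 mM iodoacetamide into 105 uL
enzyme_ug = enzyme_mass_ug(20.0, 50.0)       # 20 ug protein, substrate/enzyme = 50

if __name__ == "__main__":
    print(f"DTT during reduction: {dtt_mM:.1f} mM")
    print(f"Iodoacetamide during alkylation: {iaa_mM:.1f} mM")
    print(f"Protease required: {enzyme_ug:.2f} ug")
```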
Peptide analysis by Reverse-Phase Liquid Chromatography (RPLC)-MS/MS. The digested
samples were separated using a RPLC-MS/MS on a Vanquish UHPLC system (Thermo Fisher
Scientific) coupled with ESI-MS detector (Thermo Q Exactive Orbitrap). Peptide separation was
achieved using a Waters Acquity UPLC Peptide BEH C18 column (1.7 μm, 2.1 mm × 150 mm;
Waters). The eluents were 0.1% formic acid in water (solvent A) and 0.1% formic acid in
acetonitrile (solvent B). The separation was performed at 60 °C. The following gradient was used
for MS detection: t = 0 min, 95% solvent A (0.2 mL min⁻¹); t = 45 min, 40% solvent A (0.2 mL min⁻¹);
t = 55 min, 5% solvent A (0.2 mL min⁻¹); t = 55.1 min, 95% solvent A (0.2 mL min⁻¹). During LC–
MS analysis, the following parameters were used: sheath gas flow rate of 10 arbitrary units (AU),
capillary temperature of 250 °C and spray voltage of 1.5 kV. The mass spectra were acquired in
positive mode with an m/z range of 200–3,000 at a resolution of 70,000. The automatic gain
control target was set at 1 × 10⁶, and a maximum injection time of 100 ms was used. HCD mass
spectra were acquired in the data-dependent mode for the five most abundant ions with a
resolution of 17,500. Automatic gain control target, maximum injection time and isolation window
were set at 2 × 10⁵, 200 ms and 2.0 m/z, respectively. HCD-normalized collision energy was 25%.
The data were recorded by Xcalibur (Thermo, version 4.1) and analyzed using Thermo
BioPharma Finder software.
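The gradient set points listed above can be turned into a solvent composition at any time point. The sketch below assumes linear ramps between consecutive set points, which is the usual behaviour of a binary pump but is not stated explicitly in the text; the function names are hypothetical.

```python
# (time in min, % solvent A) set points as listed in the text.
GRADIENT = [(0.0, 95.0), (45.0, 40.0), (55.0, 5.0), (55.1, 95.0)]


def percent_a(t_min, gradient=GRADIENT):
    """% solvent A at time t_min, assuming linear ramps between set points."""
    if t_min <= gradient[0][0]:
        return gradient[0][1]
    for (t0, a0), (t1, a1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return a0 + (a1 - a0) * (t_min - t0) / (t1 - t0)
    # After the last set point the composition is held constant.
    return gradient[-1][1]


def percent_b(t_min, gradient=GRADIENT):
    """% solvent B, the complement of solvent A in a binary gradient."""
    return 100.0 - percent_a(t_min, gradient)


if __name__ == "__main__":
    for t in (0, 22.5, 45, 50, 55, 60):
        print(f"t = {t:>5} min: {percent_a(t):5.1f}% A, {percent_b(t):5.1f}% B")
```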
The peptide sequences (EG319RVQP…VN541FS with C-terminal FLAG (SGDYKDDDDKG) and
His tags (HHHHHHG), UniProt number P0DTC2) were then identified using the theoretical digest
feature of the software. Carbamidomethylation and carboxymethylation at cysteine residues were
used as fixed modifications. Common mammalian N- and O-glycans were also used as variable
modifications. A precursor mass tolerance of 5 ppm was set. For quantification, the abundance
of each N-glycan at each N-glycosylation site (N331 and N343) is the sum of MS areas under the
peak curve divided by the corresponding charge states. Next, for each N-glycosylation site, the
relative abundance of each N-glycan is calculated as its abundance over the total abundance of
all N-glycans detected.
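The quantification rule above can be sketched in a few lines. The example interprets "divided by the corresponding charge states" as dividing each peak area by the charge state of that ion before summing, which is an assumption about the authors' procedure; the peak list is hypothetical and stands in for the output of the peptide-mapping software.

```python
from collections import defaultdict


def glycan_abundance(peaks):
    """Sum charge-normalised peak areas per glycan.

    peaks: iterable of (glycan_name, charge_state, peak_area) for one site.
    """
    totals = defaultdict(float)
    for glycan, charge, area in peaks:
        totals[glycan] += area / charge
    return dict(totals)


def relative_abundance(peaks):
    """Abundance of each glycan over the total abundance at the same site."""
    abundance = glycan_abundance(peaks)
    total = sum(abundance.values())
    return {glycan: value / total for glycan, value in abundance.items()}


if __name__ == "__main__":
    # Hypothetical peaks for a single N-glycosylation site.
    site_peaks = [
        ("FA2G2", 2, 200.0),
        ("FA2G2", 3, 300.0),
        ("Gn", 2, 100.0),
    ]
    for glycan, fraction in relative_abundance(site_peaks).items():
        print(f"{glycan}: {100 * fraction:.1f}%")
```

The same function would be applied separately to the peaks assigned to N331 and to N343, since relative abundances are defined per site.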
Results
In this section we start with a brief overview of the architecture of the RBD; we then explain how
the RBD structure is modulated by interactions with ACE2 and why the N343 glycan is integral to
its stability. We then describe how and why the loss of N343 glycosylation affects the RBD
structure and its binding affinity for GM1os and GM2os to different degrees in the VoCs.
SARS-CoV-2 S RBD structure and antigenicity. The SARS-CoV-2 S RBD encompasses both
structured and intrinsically disordered regions. The structured region is supported by a largely
hydrophobic beta sheet core, framed by two flanking, partially helical loops (aa 335-345 and aa
365-375), linked by a bridging N-glycan at N343, see Figure 1b. The aa 335-345 loop carries the
N343 glycosylation site and it is part of an important antigenic region targeted by Class 2 and 3
antibodies
(Bangaru et al., 2022; Barnes et al., 2020; Carabelli et al., 2023; Chen et al., 2022)
. In
the bridging conformation, the N343 glycan pentasaccharide extends across the RBD beta sheet
to reach the aa 365-375 loop forming highly populated hydrogen bonding and dispersion
interactions with the backbone and with the sidechains of residues 365 to 375, see Figures 1b,c
and S.2. The bridging N343 glycan shields the hydrophobic beta sheet core of the RBD from the
surrounding water, preventing energetically unfavourable contacts. Due to its amphipathic nature,
the N343 glycan forms dispersion interactions with the hydrophobic residues of the beta sheet through
its core GlcNAc-β(1-4)GlcNAc, while engaging in hydrogen bonds with the surrounding water and
with the aa 365-375 helical loop. Notably, the key anchoring residues S371, S373 and S375 within
this loop are all mutated to hydrophobic residues in all omicron variants (BA.1-2, BA.4-5, BQ.1.1,
EG.5.1, XBB.1.5).
The receptor binding motif (RBM) encompasses aa 439-506 and counts all the RBD residues in
direct contact with ACE2
(Lan et al., 2020)
. The RBM is heavily targeted by both Class 1 and 2
antibodies
(Bangaru et al., 2022; Barnes et al., 2020; Carabelli et al., 2023; Chen et al., 2022)
and is
under high evolutionary pressure, with all VoCs carrying mutations in this region. As shown by
earlier MD simulation studies
(Casalino et al., 2020; Harbison et al., 2022; Sztain et al., 2021;
Williams et al., 2022)
, the RBM in unbound S is largely unstructured and dynamic, an insight also
supported by the low resolution cryo-EM maps of this region
(Gobeil et al., 2022; Walls et al.,
2020; Wrapp et al., 2020)
. The RBM's inherent flexibility is likely an important feature in the
opening and closing mechanism of the RBD, where the N343 glycans from adjacent RBDs engage with
the protein in closed conformation and gate RBD opening
(Sztain et al., 2021)
. The only relatively
structured region of the RBM is what we define from here on as the hydrophilic patch, see Figure 1b,
a hairpin stabilised by a network of interlocking salt-bridges and polar residues, namely R454,
R457, K458, K462, E465, D467, S469, and E471, that faces the interior of the S when the RBD
is closed.
When in complex with ACE2 or with antibodies, the RBM adopts a structured fold, also shared by
the SARS-CoV-1 S RBD
(Li et al., 2005)
. In this conformation only the terminal hairpin of the RBM
(aa 476 to 486) retains a high degree of flexibility, as shown in this work and by others
(Williams
et al., 2022)
. The RBM bound fold is stabilised by a hydrophobic patch supported by the stacking
of the aromatic and aliphatic residues L455, F456, Y473, A475, see Figure 1b, which are part of
the protein interface with ACE2. Notably, all residues in the hydrophobic and hydrophilic patches
are highly conserved across the VoCs, possibly due to their critical function in inducing and/or
stabilising the RBD into its ACE2-bound conformation. As an interesting observation, the loss of
stacking in the hydrophobic patch due to the recent F456L mutation in the EG.5.1 variant (China,
2023) is recovered by the L455F mutation in the, appropriately named, FLip variant.
Based on evidence from screenings
(Bangaru et al., 2022; Carabelli et al., 2023; Chen et al.,
2022)
, we subdivided the RBD into three different antigenic regions known to be targeted by
different classes of antibodies, see Figure 1d. Region 1 stretches from aa 337-353, which
includes the N343 glycosylation site, and counts residues targeted by class 2 and 3
antibodies
(Bangaru et al., 2022; Carabelli et al., 2023; Harvey et al., 2021)
. The aa sequence in
Region 1 has been highly conserved so far, allowing specific antibodies to retain their
neutralisation activity across all VoCs, such as S309
(Piccoli et al., 2020; Pinto et al., 2020)
, whose
binding mode also directly involves the N343 glycan
(Liu et al., 2021)
. A notable and dramatic
exception to this high degree of conservation in Region 1 is given by the BA.2.86 variant
(Denmark, 2023), known as 'pirola', where the K356T mutation introduces a new N-glycosylation
sequon at N354. Region 2 coincides with the RBM, which, in addition to binding ACE2 and
neutralising antibodies, used to bind the N370 glycan from adjacent RBDs
(Allen et al., 2023;
Harbison et al., 2022; Watanabe et al., 2020b)
. N370 glycosylation is lost in SARS-CoV-2, leaving
the RBM binding cleft available to bind glycan co-receptors, such as glycosaminoglycans
(Clausen
et al., 2020; Kearns et al., 2022)
, blood group antigens
(Nguyen et al., 2021; Wu et al., 2023)
,
monosialylated gangliosides
(Nguyen et al., 2021)
, among others. Region 3 is a short, relatively
structured loop stretching between aa 411-426, located on the opposite side of the RBD relative
to Region 1, see Figure 1d.
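The gain of a sequon at N354 through K356T follows directly from the N-glycosylation consensus motif N-X-S/T, where X is any residue except proline. The sketch below scans a short peptide for sequons before and after the mutation. The 18-residue fragment is assumed to correspond to residues 343 to 360 of the WHu-1 S sequence and should be verified against UniProt P0DTC2 before any real use; the helper names are hypothetical.

```python
import re

# N-X-S/T with X != P. A lookahead is used so overlapping sequons are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence, first_residue=1):
    """Return residue numbers of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + first_residue for m in SEQUON.finditer(sequence)]


def apply_mutation(sequence, mutation, first_residue=1):
    """Apply a point mutation written as e.g. 'K356T' to a numbered fragment."""
    wild_type, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    index = position - first_residue
    if sequence[index] != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {sequence[index]}")
    return sequence[:index] + new + sequence[index + 1:]


# Assumed WHu-1 fragment, residues 343-360 (verify against UniProt P0DTC2).
FRAGMENT = "NATRFASVYAWNRKRISN"
FIRST = 343

if __name__ == "__main__":
    print("WHu-1 fragment:", find_sequons(FRAGMENT, FIRST))
    mutant = apply_mutation(FRAGMENT, "K356T", FIRST)
    print("K356T fragment:", find_sequons(mutant, FIRST))
```

A sequon is necessary but not sufficient for glycosylation, so a scan of this kind only flags candidate sites; occupancy has to be established experimentally.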
Effect of the loss of N343 glycosylation on the structure of the WHu-1, alpha and beta
RBDs. Results obtained for the WHu-1 strain and for the alpha (B.1.1.7) and beta (B.1.351) VoCs
are discussed together due to their sequence and structure similarity, with alpha counting only
one mutation (N501Y) and beta three mutations (K417N, E484K and N501Y) relative to the
WHu-1 RBD. Extensive sampling through conventional MD, i.e. 4 μs for the alpha and beta VoCs and 8
μs with an additional 4 μs of Gaussian accelerated MD (GaMD) for the WHu-1 RBD, see Table
S.1, shows that the loss of N343 glycosylation induces a dramatic conformational change in the
RBD, where one or both helical loops flanking the hydrophobic beta sheet core pull towards each
other, see Figure 2. This conformational change can occur very rapidly upon removal of the
N-glycan or after a longer delay due to the complexity of the conformational energy landscape. The
data used for the analysis correspond to systems that have reached structural stability, i.e.
equilibrium; we discarded the timeframes corresponding to conformational transitions.
Figure 2 (caption, partially recovered). [...] different RBDs (including intact and endoF3-treated WHu-1 RBD and alpha
and beta RBD) and the GM1os (GlyTouCan-ID G46613JI) and GM2os (GlyTouCan-ID G61168WC) oligosaccharides. HEK293a
samples are from (Nguyen et al., 2021) and are shown here as reference. HEK293b samples all carry FLAG and His tags
and are shown for WHu-1 (glycosylated and treated with endoF3), alpha and beta sequences. Further details in
Supplementary Material. Panel e) Predicted complex between the WHu-1 RBD and GM1os, with GM1os represented with
sticks in SNFG colours, the protein represented with cartoons (cyan) and the N343 with sticks (white). Residues
directly involved in the GM1os binding or proximal are labelled and highlighted with sticks. All N343 glycosylated
RBDs carry also a FA2G2 N-glycan (GlyTouCan-ID G00998NI) at N331, which is not shown for clarity. Rendering done with
VMD (https://www.ks.uiuc.edu/Research/vmd/), KDE analysis with seaborn (https://seaborn.pydata.org/) and bar plot
with MS Excel.
To explore the effects of the loss of N343 glycosylation in the WHu-1 RBD, we started the MD
simulations from different conformations. In one set of conventional sampling MD trajectories
(MD1) and in the GaMD simulations the starting structure corresponds to an open RBD from an
MD equilibrated S ectodomain obtained in earlier work
(Harbison et al., 2022)
. In this system the
RBM is unfolded and retains the maximum degree of flexibility. MD2 was started from a
conformation corresponding to the ACE2-bound structure
(Lan et al., 2020)
. Results obtained from
MD2 and GaMD are entirely consistent with results from MD1, and thus are included as
Supplementary Material in Figure S.2. The GaMD simulation shows a lower degree of contact of
the N343 glycan with the aa 365-375 stretch of the opposite loop, see Figure 1c, because most
of the contacts are with residues further downstream from position 365. Nevertheless, the N343
remains engaged in a bridging conformation throughout the simulation. As shown by the distributions of
RMSD values, represented through Kernel Density Estimates (KDE) in Figure 2, the
structure of Region 1 in the WHu-1 RBD is stable. In the glycosylated RBD the stability of Region
1 is largely due to the contribution of the bridging N343 FA2G2 glycan, forming hydrogen bonds
with the residues in loop aa 365-375 throughout the simulations, see Figures 1c and S.2.
Conversely, the conformation of the RBM (Region 2) is very flexible in both glycosylated and
non-glycosylated forms. Loss of N343 glycosylation triggers a conformational change in Region 1
of the WHu-1 RBD, shown by a broader KDE peak in Figure 2a. This conformational change
ultimately triggers the complete detachment of the hydrophilic loop from Region 1, see Figure 1c,
through rupture of the non-covalent interactions network between Y351 (Region 1) and S469 or
T470 (Region 2) via hydrogen bonding, and between Y351 and L452 (Region 2) via CH-π stacking.
Structural changes in Region 3 upon loss of glycosylation at N343 appear to be negligible.
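The RMSD distributions discussed above are summarised as kernel density estimates, where a single narrow peak indicates one stable conformation and a broader or multi-modal curve indicates a conformational change. The figure captions report that the KDE analysis was done with seaborn; the sketch below is not that workflow but a minimal, dependency-free Gaussian KDE that illustrates the idea. The bandwidth rule (Scott's rule of thumb) and the RMSD values are assumptions made for the example.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function estimating the probability density of 1D samples."""
    n = len(samples)
    if bandwidth is None:
        # Scott's rule of thumb for one-dimensional data (assumed default).
        bandwidth = statistics.stdev(samples) * n ** (-1.0 / 5.0)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian kernels centred on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


if __name__ == "__main__":
    # Hypothetical RMSD values (in Angstrom) with two conformational states.
    rmsd = [1.0, 1.1, 0.9, 1.2, 1.05, 2.4, 2.5, 2.6]
    kde = gaussian_kde(rmsd)
    for x in (0.5, 1.0, 1.75, 2.5, 3.0):
        print(f"RMSD {x:4.2f}: density {kde(x):.3f}")
```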
The starting structure used for the simulations of the alpha RBD derives from the ACE2-bound
conformation of the WHu-1 RBD (PDB 6M0J) modified with the N501Y mutation. The
reconstructed glycan at N343 interacts with the aa 365-375 loop throughout the entire trajectory, but it
adopts a stable conformation only after 830 ns, where we started collecting the data shown in
Figure 2b. Again, we see that the loss of glycosylation at N343 causes a swift conformational
change that brings the aa 335-345 and aa 365-375 loops closer together, see Figure 2b. This
conformational change involves primarily Region 1, and just like the previous case, it ultimately
determines the detachment of the hydrophilic patch from the Y351 in Region 1. Also shown by
the KDE plot in Figure 2b, a small conformational change in Region 3, which involves a partial
disruption and refolding of a helical turn, can be observed during the trajectory of the N343
glycosylated alpha RBD. As in the previous case the structure of Region 3 appears to be
unaffected by N343 glycosylation, at least within the sampling accumulated in this work.
In the beta RBD (starting structure from PDB 7LYN) the reconstructed N343 glycan adopts a
bridging conformation quite rapidly and retains this conformation throughout the trajectory with
only minor deviations. The corresponding KDE distributions of the RMSD values for Regions 1 to 3, see
Figure 2c, reflect this structural stability. The stability of the RBM (Region 2) is supported by
interactions between Y351 (Region 1) and the hydrophilic loop, as noted earlier. Loss of
glycosylation at N343 causes a rapid tightening of the RBD core helical loops towards each other,
which again in this case ultimately causes the detachment of the hydrophilic loop from Y351 in
Region 1 towards the end of the MD trajectory, i.e. after 1.9 μs of sampling.
Effect of the loss of N343 glycosylation on the binding affinity of GM1os/2os for the WHu-1,
alpha and beta RBDs. In earlier work we presented a model of the complex between GM1 and
the WHu-1 RBD
(Garozzo et al., 2022)
to understand the role of monosialylated gangliosides as
co-receptors in SARS-CoV-2 infection
(Nguyen et al., 2021)
. The predicted binding site we
validated through extensive MD sampling is located at the junction between Region 1 and Region
2 of the WHu-1 RBD and it involves all the residues that stabilise the region, namely Y351, L452,
S469 and T470, see Figures 1c and 2e. As part of our investigation of glycan co-receptors binding
to the SARS-CoV-2 RBD, we used direct ESI-MS assay to determine the impact of the loss of
N343 glycosylation on GM1os and GM2os binding. Here, we used endoF3-treatment to trim down
the fucosylated biantennary and triantennary complex N-glycans into core nonfucosylated or
fucosylated GlcNAc (Gn or GnF, respectively). LC-MS analysis suggests that N-glycans on N343
but not N331 of WHu-1 RBD were trimmed down (Figure S.6). From the zero-charge mass
spectra of endoF3-treated WT RBD (Figure S.5), we performed glycan assignment (Table S.8)
and found that 31% of detected glycoforms contained Gn/GnF at N343 while the remainder was
in the intact form. Affinity data in Figure 2d show that the enzymatic removal of the N343 glycan
from the WHu-1 RBD causes a complete loss of GM1os/2os binding, which is consistent with both
the involvement of the junction between Regions 1 and 2 in the binding and its allosteric control of
the RBM dynamics. Furthermore, while binding of GM1os and GM2os to the alpha RBD appears
to be slightly decreased relative to WHu-1, binding to the beta RBD is dramatically reduced. We
can reconcile this finding with the mutation E484K in beta, which changes the key interaction
between E484 and GM1os, see Figure 2, and with changes in structure and dynamics of the RBM
terminal hairpin induced by mutations
(Williams et al., 2022)
, which have also been suggested to
affect the S opening kinetics
(Y. Wang et al., 2021)
.
Effects of N343 glycosylation on the structure of the delta RBD. The delta (B.1.617.2) RBD
carries two mutations, namely L452R and T478K, relative to the WHu-1 strain. The open RBD in
the cryo-EM structure PDB 7V7Q was used as starting conformation for the MD simulations of
both the glycosylated, and the non-glycosylated delta RBDs. To understand how the mutations in
delta affect the RBD structure and modulate the response to the loss of glycosylation at N343, we
ran two uncorrelated conventional MD simulations (2 μs) and one GaMD simulation (2 μs) for
both the glycosylated and non-glycosylated systems, for a total (cumulative) sampling of 12 μs.
Results are shown in Figure 3. In the glycosylated delta RBD the N343 glycan is observed to be
much more dynamic than in the WHu-1, alpha and beta RBDs, engaging in contacts with different
regions of the RBD in addition to the loop aa 365-375. In response to these fluctuations the
Figure 3 (caption, partially recovered). [...] of the delta RBD are represented with VDW spheres partially visible
under the N-glycans overlay. Panel f) Insert showing the junction between Regions 1 and 2 from the left-hand side of
the RBD in panel e). The residues involved in the network solidifying the junction are highlighted with sticks and
labelled. Panel f) Affinities (1/Kd, ×10³ M⁻¹) for interactions between GM1os (GlyTouCan-ID G46613JI) and GM2os
(GlyTouCan-ID G61168WC) oligosaccharides and the intact and endoF3-treated delta RBD and omicron RBD. Rendering done
with VMD (https://www.ks.uiuc.edu/Research/vmd/), KDE analysis with seaborn (https://seaborn.pydata.org/) and bar
plot with MS Excel.
The effect of the loss of glycosylation at N343 on the delta RBD was assessed by running two
uncorrelated MD simulations, one by conventional sampling (MD1 of 3 μs) and the other through
enhanced sampling (GaMD of 2 μs). As a consequence of the L452R mutation shown in Figure
3, the tightening of the helical loops aa 335-345 and aa 365-375 over the hydrophobic core of the
RBD occurring upon loss of glycosylation at N343 does not affect the structure and dynamics of
the junction between Regions 1 and 2, see Figure S.3. Results of the conventional MD simulation
show that the tightening of the loops is mainly achieved by a larger displacement of the aa 365-375
loop rather than of Region 1, while the GaMD results show tightening of both loops, see
Figure S.3. In all simulations the structure of the junction between Regions 1 and 2 remains
undisturbed, with no detachment of the hydrophilic patch within the sampling we collected.
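As an illustration of how a tightening of two loops can be followed along a trajectory, the sketch below computes the distance between the geometric centres of two groups of atoms, frame by frame. This is a minimal example under stated assumptions: the centroid-to-centroid metric, the function names and the toy coordinates are hypothetical and do not reproduce the analysis protocol or the data of this work.

```python
import math

def centroid(points):
    """Geometric centre of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def loop_separation(loop_a, loop_b):
    """Distance between the centroids of two atom groups (same units as the input)."""
    return math.dist(centroid(loop_a), centroid(loop_b))

def separation_series(frames_a, frames_b):
    """Per-frame centroid separation for two equally long lists of frames."""
    return [loop_separation(a, b) for a, b in zip(frames_a, frames_b)]

# Toy coordinates in Angstrom (hypothetical, NOT taken from any simulation).
# Each frame holds two atoms per loop; the second frame is "tighter".
frames_loop_a = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (2.0, 1.0, 0.0)],
]
frames_loop_b = [
    [(0.0, 12.0, 0.0), (2.0, 12.0, 0.0)],
    [(0.0, 10.0, 0.0), (2.0, 10.0, 0.0)],
]

series = separation_series(frames_loop_a, frames_loop_b)
mean_separation = sum(series) / len(series)
print("per-frame separation:", series)
print("mean separation:", mean_separation)
```

Per-frame values of this kind are the sort of quantity whose distribution can then be compared between the glycosylated and non-glycosylated systems, for instance with a kernel density estimate.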
Effect of the loss of N343 glycosylation on the binding affinity of GM1/2 for the delta RBD.
To examine the effect of N343 glycosylation on glycan binding of delta RBD, we used the direct
ESI-MS assay to quantify the binding affinities between the endoF3-treated delta RBD and GM1os and
GM2os. From the zero-charge mass spectra of endoF3-treated RBD, see Figure S.5, we
performed glycan assignment, see Table S.9, and found that both N331 and N343 glycans were
trimmed down to Gn/GnF. Direct ESI-MS data in Figure 3f show no loss of GM1/2 binding in the
delta RBD upon loss of N343 glycosylation, which further supports the involvement of the Region
1 to 2 junction in sialylated glycan recognition.
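For readers unfamiliar with how an affinity can be derived from a mass spectrum, the sketch below applies the general mass-balance relation for 1:1 protein-ligand binding, Ka = R / ([L]0 - R/(1+R) [P]0), where R is the abundance ratio of ligand-bound to free protein and [L]0 and [P]0 are the total ligand and protein concentrations. The equation is not stated in this text and is given here as an assumption about the simplest case; the function names and the input values are hypothetical, and corrections such as those for nonspecific binding are not included.

```python
def association_constant(ratio, ligand_total, protein_total):
    """Ka in M^-1 for 1:1 binding.

    ratio         : R = [PL]/[P], abundance ratio of bound to free protein
    ligand_total  : total ligand concentration [L]0, in M
    protein_total : total protein concentration [P]0, in M
    """
    if ratio < 0:
        raise ValueError("abundance ratio must be non-negative")
    bound = ratio / (1.0 + ratio) * protein_total  # [PL] from protein mass balance
    free_ligand = ligand_total - bound             # [L] from ligand mass balance
    if free_ligand <= 0:
        raise ValueError("inconsistent inputs: no free ligand left")
    return ratio / free_ligand                     # Ka = [PL] / ([P][L]) = R / [L]

def in_reported_units(ka):
    """Express Ka (= 1/Kd) in units of 10^3 M^-1."""
    return ka / 1e3

# Hypothetical inputs, chosen only to exercise the function
ka = association_constant(ratio=0.05, ligand_total=50e-6, protein_total=5e-6)
print("Ka (x10^3 M^-1):", in_reported_units(ka))
```

Within this relation, a complete loss of binding corresponds to R approaching zero, and thus to Ka approaching zero, whatever the concentrations used.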
Effects of N343 glycosylation on the RBD structure in the omicron BA.1 SARS-CoV-2. The
omicron BA.1 RBD carries 15 mutations relative to the WHu-1 strain, namely S371L, S373P,
S375F, K417N, N440K, G446S, S477N, T478K, E484A, Q493R, G496S, Q498R, N501Y, Y505H,
and T547K. The S371L, S373P, and S375F mutations, retained in all omicron VoCs including the
most recently circulating XBB.1.5, EG.5.1 and BA.2.86, remove all hydroxyl sidechains that we
have seen being involved in hydrogen bonding interactions with the N343 glycan in the WHu-1,
alpha, beta and delta RBDs, see Figures 1c and S.3. We investigated the effects of the loss of
glycosylation at N343 in the structure of the BA.1 RBD through two sets of uncorrelated
conventional MD simulations (MD1 and MD2) and one set of GaMD, with total cumulative
simulation time of 12 μs. Starting structures correspond to the open RBD in PDB 7QO7 (MD1)
and in PDB 7WVN (MD2 and GaMD), where the N343 glycan was reconstructed in different
conformations, depending on the spatial orientation of the N343 sidechain. The results of the MD1
and GaMD simulations show that, despite the S371L, S373P, and S375F mutations, the N343
glycan still forms stable contacts with the aa 365-375 loop, see Figures 1c and S.3, and these
interactions contribute to the stability of the RBD structure, see Figure 4. In the starting structure
we used for MD2 the N343 glycan was built with the core pentasaccharide pointing away from
the RBD hydrophobic core. Consequently, the N-glycan adopts different transient conformations
during the MD2 trajectory, which terminate with an interaction with the hydrophobic interior of the
RBD and with the N331 glycan, see Figure S.6. In all simulations the loss of glycosylation at N343
causes a tightening of the aa 335-345 and aa 365-375 loops, which in omicron is stabilised by
more efficient packing of the aa 365-375 loop within the hydrophobic core, driven by the
embedding of the L371 and F375 sidechains. The non-glycosylated RBD adopts a stable
conformation where we do not see a detachment of the hydrophilic patch. The stabilising effect
of the aa 365-375 loop mutations in omicron could not be tested by means of affinity for GM1os/2os
as omicron (BA.1) binds those epitopes only weakly, see Figure 3f. Based on the binding site we
predicted by MD simulations in earlier work
(Garozzo et al., 2022)
, see Figures 1d and 2e, and
as observed for beta, the loss of E484 due to the E484A mutation in omicron may negate
GM1os/2os binding.
The stability of the RBD structure is further enhanced by the presence of an additional
glycosylation site at N354, which appeared in the recently detected omicron BA.2.86 'pirola'
variant. As shown in Figure 4d-f, the N-glycans at N343 and N354 are tightly intertwined
throughout the trajectory, stabilising Region 1 and shielding the area very effectively. The
presence of an additional N-glycosylation site at N354 also changes the conformation of the loop
that hosts the site relative to the BA.1 starting structure we used as a template to run the MD
simulation, see Figure 4f. To note, based on earlier glycoproteomics analysis
(Newby et al., 2022;
Watanabe et al., 2020a)
and on the exposure to the solvent of the reconstructed glycan structure
at N354, we chose to occupy all glycosylation sites with FA2G2 N-glycans.
Discussion
Quantifying the role of glycosylation in protein folding and structural stability is a complex task
due to the dynamic nature of the glycan structures
(Fadda, 2022; Woods, 2018)
and to the
micro- and macro-heterogeneity in their protein functionalization
(Čaval et al., 2021; Riley et al., 2019;
Struwe and Robinson, 2019; Thaysen-Andersen and Packer, 2012; Zacchi and Schulz, 2016)
that hinder characterization. Yet, the fact that protein folding occurs within a context where
glycosylation types and occupancy can change on the fly suggests that not all glycosylation sites
are essential for the protein to achieve and retain a native fold and that those sites may be
displaced without consequences to function. In this work we investigated the structural role of the
N-glycosylation at N343 in the SARS-CoV-2 S RBD, one of the most highly conserved sites in the
viral phylogeny
(Harbison et al., 2022)
. Extensive MD simulations in this and in earlier work by us
and others
(Casalino et al., 2020; Grant et al., 2020; Harbison et al., 2022; Sikora et al., 2021)
show that the RBD core is effectively shielded by this glycan. Furthermore, the N343 glycan has
been shown to be mechanistically involved in the opening and closing of the S
(Sztain et al., 2021)
,
making this glycosylation site functionally essential towards viral infection.
In this work we performed over 45 μs of cumulative MD sampling with both conventional and
enhanced schemes to show that the N343 glycan also plays a fundamental structural role in the
WHu-1 SARS-CoV-2 and that this role has changed in the variants circulating thus far. While we
cannot gauge how fundamental N343 glycosylation is to RBD folding, we see that the
amphipathic nature of the complex N-glycan
(Watanabe et al., 2020a)
at N343 enhances the
stability of the RBD architecture, bridging between the two partially helical loops that frame a
highly hydrophobic beta sheet core. Of note, we determined the same bridging structures also for
oligomannose-type N-glycans at N343 in earlier work
(Harbison et al., 2022)
. In all variants we
observe that the removal of the glycan at N343 triggers a tightening of the loops in a response
likely aimed at limiting access of water into the hydrophobic core. In WHu-1, alpha and beta RBDs
this event allosterically controls the dynamics of the RBM, ultimately causing the detachment of
the hydrophilic patch and misfolding from the ACE2-recognized conformation. These results are
in agreement with the drastic reduction of viral infectivity observed upon deletion of both N331
and N343 glycosylation in the WHu-1 strain
(Li et al., 2020)
, where loss of structure may add to
the loss of function through gating
(Sztain et al., 2021)
or vice versa.
As a functional assay to support this molecular insight, we determined how the binding affinity of
the RBD for the oligosaccharides of the monosialylated gangliosides GM1os and GM2os is
modulated by N343 glycosylation. These were shown in earlier work by us and others to function
as co-receptors in WHu-1 infection
(Nguyen et al., 2021)
. We predicted through extensive MD
sampling that GM1os and GM2os bind the RBD at a site corresponding precisely to the location
occupied by the 6-arm of an ancestral N-glycan at N370
(Garozzo et al., 2022; Harbison et al.,
2022)
. Note, the N370 site is still occupied in zoonotic sarbecoviruses
(Allen et al., 2023)
. The
GM1os binding site, see Figure 2e, is located precisely at the junction between Regions 1 and 2,
which is disrupted by the loss of N343 glycosylation in WHu-1. Accordingly, we find that enzymatic
removal of the N343 glycan abolishes GM1os and GM2os binding in the WHu-1 RBD, see Figure
2d. While we expect a similar loss of binding in alpha, within the context of a lower affinity relative
to the WHu-1 RBD, we find that the beta RBD does not bind GM1os and GM2os, regardless of its
glycosylation state. Based on the structure of the GM1-RBD complex we identified, see Figure
2e, where E484 represents a key contact to the oligosaccharides, the mutation of E484K in beta
may be key to the loss of binding, together with changes in the RBM kinetics linked to this mutation
and to variations within the same region
(Y. Wang et al., 2021; Williams et al., 2022)
.
In the delta variant we observed that the L452R mutation is responsible for an increased structural
stability of the RBD, reinforcing the non-covalent interactions network between Region 1 and the
RBM. Indeed, the tightening of the loops occurring upon loss of N343 glycosylation does not
trigger a misfolding of the RBM, see Figure 3. Accordingly, we observe that the delta RBD with
the trimming down of N331 and N343 glycans shows no significant change in binding affinity for
GM1os and GM2os relative to the fully glycosylated form, see Figure 3f.
In all omicron variants, including all the currently circulating VoCs and VUMs, the loop aa
365-375 that the N343 glycan hooks on carries similar mutations, with the highly conserved S371,
S373 and S375 all mutated to hydrophobic residues, see Figure S.1. Our MD results on the BA.1
and BA.2.86 RBDs show that hydrophobic residues at positions 371, 373 and 375 can pack within
the RBD core, while leading to a loop structure that can support the N343 glycan branches through
interactions with the backbone, see Figure 4. We have shown for all variants that the contacts
between the N343 glycan and the aa 365-375 stretch of the opposite loop are fairly equally
distributed between hydrophilic (hydrogen bonding) and hydrophobic (dispersion or van der
Waals) type interactions, see Figure 1c. Therefore, it is expected that the loss of anchoring
hydrogen bonding residues can be supported through other interactions. Within this context, the
removal of the N343 glycan does still cause a tightening of the loops, yet through a different
mechanism relative to the other variants that ultimately does not appear to affect the RBM
dynamics. As in beta, for omicron there is negligible binding of the N343 glycosylated RBD to
GM1os and GM2os, likely due to the E484A mutation, which would deny a key interaction within
the predicted binding site, see Figure 2e.
Conclusions
Taken together, our results show that since the WHu-1, alpha and beta strains, the RBD has
evolved to make the N-glycosylation site at N343 structurally dispensable. Within this framework,
provided that an N-glycosylation site in the immediate vicinity of N343 is necessary for folding and
for function, a shift of the site within the sequence can potentially occur. Such a modification may
negatively affect recognition and binding by neutralising antibodies
(Liu et al., 2021; Piccoli et al.,
2020; Pinto et al., 2020)
and thus promote evasion. We have also shown for BA.2.86 that the
new glycosylation site at N354 can effectively contribute to the stability of Region 1, while
significantly increasing shielding.
Moreover, we show that specific VoCs lost affinity for monosialylated ganglioside
oligosaccharides with a trend in agreement with a binding site located at the junction between
Region 1 and the RBM, which is part of the N370 glycan binding cleft on the RBD
(Harbison et al.,
2022)
. This conclusion is further supported by how binding affinities for GM1os and GM2os change
upon the loss of N343 glycosylation, in agreement with the MD results. Further to this, as
mutations we identified dampened binding to monosialylated ganglioside oligosaccharides, it is
also possible that further mutations may switch the affinity back on or determine a shift of
preference of the RBD towards other glycans that can still be recognised within the N370 cleft.
Further work is ongoing in this area.
Finally, the results from this work point to the importance of understanding the impact of
N-glycosylation on protein structure and stability, with immediate consequences for COVID-19
vaccine design. Indeed, earlier work shows SARS-CoV-2 S-based protein vaccines with
increased efficacy due to the removal of N-glycans
(Huang et al., 2022)
, and of RBD-based
vaccines in use and under development
(Cohen et al., 2022; Más-Bermejo et al., 2022;
Valdes-Balbin et al., 2021)
that may be designed with and without N-glycans. The design of such
constructs may benefit from understanding which N-glycosylation sites are structurally essential
and which are dispensable.
Acknowledgements
The Science Foundation of Ireland (SFI) Frontiers for the Future Programme is gratefully
acknowledged for financial support of CMI postdoctoral training (20/FFP-P/8809). The opinions,
findings, and conclusions or recommendations expressed in this material are those of the
author(s) and do not necessarily reflect the views of the Science Foundation Ireland. CMI and EF
gratefully acknowledge ORACLE for Research for the generous allocation of computational and
data storage resources. CAF acknowledges the Irish Research Council (IRC) for funding through
the Government of Ireland Postgraduate Scholarship Programme (GOIPG/201912212). CMI,
CAF, AMH and EF acknowledge the Irish Centre for High-End Computing (ICHEC) for generous
allocation of computational resources. A large part of the computational work described here was
run on the HPC cluster kay at ICHEC, soon to be decommissioned. We would like to take this
opportunity to thank kay for her invaluable service to the Irish scientific computing community,
together with all the staff at ICHEC that took great care of her during the past 5 years. JSK and
LN acknowledge the Natural Sciences and Engineering Research Council of Canada, the Canada
Foundation for Innovation and the Alberta Innovation and Advanced Education Research
Capacity Program for funding. We are grateful to the members of the NRC-HHT Mammalian Cell
Expression Section for their contribution to the cloning, expression and purification of the various
recombinant proteins used in this study and to the Pandemic Response Challenge Program of
the National Research Council of Canada for its financial support.