Classification of glycans is an organizing principle that takes an array of complex carbohydrate structures that appear random and chaotic and arranges it in a biologically useful way. The principal basis of this organization is the chemical linkage between the glycan and the protein or lipid. Glycans attached to asparagine are called N-glycans and share a common biosynthetic origin, whereas glycans attached to serine or threonine residues are referred to as O-glycans and are formed through multiple, mostly post-folding, pathways. In addition to these two major categories, the linear glycosaminoglycans that predominate in the extracellular space, the GPI anchors that tether proteins to membrane lipids, and free oligosaccharides that serve signaling roles are often included. This classification is useful because members of each group are made by a defined biosynthetic route, tend to have conserved structural features, and share common biological functions, such as how they influence the fate of glycoproteins and cells. Classifying a glycan therefore helps the researcher predict how it will behave, choose how best to analyze it, and interpret disease-related changes in glycans. Glycomics is evolving from a descriptive science that catalogs structures towards a functional one that exploits sugars for a variety of applications, and this hierarchy is important for translating the complexity of sugars into therapeutic and diagnostic applications.
Classification schemes for glycans were proposed to bring order to the diverse world of carbohydrates by clustering them into classes that share structural and biosynthetic properties. The first and main classification is based on the amino acid residue to which the glycan is attached. A glycan is N-linked if it is attached to an asparagine within a consensus sequence (sequon) and is first assembled as a lipid-linked oligosaccharide on dolichol pyrophosphate; it is O-linked if it is attached to a serine or threonine, with no strict consensus sequence and no lipid-linked oligosaccharide precursor. The N- and O-linked types also differ in their biological roles: N-glycans act as quality-control tags and as general molecular recognition structures, while O-glycans provide flexible, conditional tuning of surface properties. There are other classes of glycans: glycosaminoglycans, linear polysaccharides of repeating disaccharide units that form the chains of proteoglycans, and GPI anchors that post-translationally attach a protein to the membrane through a glycolipid. The different classes have characteristic structures: for example, all N-glycans have a common pentasaccharide core, to which high-mannose, hybrid or complex branches may be added; O-glycans have at least eight core structures, to which different extensions can be added; and glycosaminoglycans are long (polymer-like) and may be sulfated. The class of a glycan must be considered before the right analytical methods can be chosen, a potential biological role proposed, or disease-associated changes that could serve as biomarkers anticipated.
Classification of glycans allows their large structural heterogeneity to be mapped into functional terms. This works for two reasons: first, each category of glycan has its own biosynthetic logic, and second, each class confers a specific property on its protein partner. For N-glycans, which share a common precursor and a common processing pathway, the pattern of trimming during biosynthesis constitutes a built-in quality-control mechanism in the endoplasmic reticulum (ER). The trimming intermediates can both recruit chaperones and display protein degradation signals. Glycoproteins whose glycans depart from this normal processing pathway are swiftly flagged as misfolded, and thus glycan classification allows congenital disorders of glycosylation to be diagnosed, misfolded therapeutic proteins to be detected, and the folding of therapeutic glycoproteins to be monitored. Among O-glycans, the GalNAc-type core structures are the main forms present in epithelial tissues and mucins, where they form viscoelastic barriers that can also present decoy targets for immune receptors, whereas O-GlcNAc cycling on nuclear proteins modulates the activity of transcription factors and thus links metabolism to transcription. Glycosaminoglycan classification, for example distinguishing heparan sulfate from chondroitin sulfate, allows specific binding of growth factors and morphogens to be predicted, and with it developmental processes or metastatic behavior. Classification of GPI-anchored glycans identifies a subpopulation of proteins whose membrane presence is transient and regulated. Finally, classification allows comparison between different organisms, organs or disease states and can thus be used to understand evolutionary processes and disease progression. Glycan analysis would be unmanageable if the glycome were not organized in a systematic way.
Classification of glycans thus represents an experimental pre-requisite, guiding the analytical approach, predicting biological effects, and identifying therapeutic targets.
Diversity is a key feature of glycans. The structures in which glycans occur are extremely diverse, due to the combinatorial nature of their biosynthesis and degradation. This diversity contributes to the functional diversity of glycans, which is far greater than the sequence diversity encoded in the genome. For instance, N-glycans have three broad subclasses. High-mannose glycans, which preserve most of the fourteen-sugar precursor, are recognized as quality-control tags that identify proteins resident in the endoplasmic reticulum (ER). Hybrid glycans are mixtures of trimmed cores with processed antennae. These intermediate structures are abundant in the transient secretory compartments that process and transport glycoproteins. Complex glycans have been heavily remodeled to add GlcNAc-branched antennae terminated with galactose, sialic acid, and fucose. These glycans display recognition epitopes that are read by lectins distributed throughout the circulation and interstitial fluid. In addition to this macroheterogeneity, glycans at a given site on a protein may display microheterogeneity, with a mixture of glycoforms (a subset of glycans that differ in structure) resulting from competition between glycosidase and glycosyltransferase enzymes. The resulting ensemble of molecules will have a range of affinities for glycan-binding proteins and distinct half-lives. O-glycans, which are even more diverse than N-glycans, are assembled by several initial transferases, which create eight possible core structures that are then diversely elaborated. For example, poly-N-acetyllactosamine repeats can be added to the core, and the final glycans can be sulfated or sialylated. The mucin-type O-glycan arrays that result predominate in epithelial surfaces, forming barriers to the environment; they can also be used by pathogens as landing pads to gain entry to cells. 
The site-specific addition and removal of a single O-GlcNAc on nucleocytoplasmic proteins can switch transcription factors on and off. Glycosaminoglycans are even more diverse still: these are linear polymers, typically consisting of repeating disaccharides that are sulfated at different degrees and positions along the chain. This produces binding sites with high affinities for a range of morphogens that regulate the architecture of the extracellular matrix.
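The microheterogeneity described above is usually reported as the relative abundance of each glycoform at a site. The sketch below illustrates that normalization only; the site label, glycoform names and intensities are hypothetical values chosen for the example, not data from this article.

```python
from collections import defaultdict

def relative_abundance(peaks):
    """Normalize glycoform intensities per glycosylation site.

    peaks: list of (site, glycoform, intensity) tuples.
    Returns {site: {glycoform: fraction of the site's total intensity}}.
    """
    totals = defaultdict(float)
    for site, _, intensity in peaks:
        totals[site] += intensity

    result = defaultdict(dict)
    for site, glycoform, intensity in peaks:
        if totals[site] > 0:  # skip sites with no signal at all
            previous = result[site].get(glycoform, 0.0)
            result[site][glycoform] = previous + intensity / totals[site]
    return {site: dict(forms) for site, forms in result.items()}

# Hypothetical peak list: one site carrying three glycoforms.
example_peaks = [
    ("site_A", "glycoform_1", 50.0),
    ("site_A", "glycoform_2", 30.0),
    ("site_A", "glycoform_3", 20.0),
]
profile = relative_abundance(example_peaks)
```

Comparing such per-site profiles between conditions is one simple way to express a shift in the glycoform population as a number.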
N-glycans are one of the major classes of glycans attached to proteins, and their structure and function are distinct from those of other glycan types. They are always attached to the nitrogen atom of an asparagine side chain (hence the "N" in "N-glycan") when that asparagine is part of the sequence Asn-X-Ser/Thr, where X can be any amino acid except proline. This type of linkage also depends on the entry of the protein into the secretory pathway, so N-glycans are found on proteins destined for the cell surface or for secretion. The glycan is first assembled as a precursor on the lipid carrier dolichol phosphate, transferred en bloc to the protein in the ER, and then trimmed and modified in the ER and Golgi. The precursor is a tetradecasaccharide, and this process of N-glycan assembly is one of the most evolutionarily conserved pathways in eukaryotes. Many different final structures can be synthesized; all share the same core pentasaccharide but differ in the number, branching and terminal sugars of their antennae. These structures are built in assembly-line fashion in the Golgi, where the order in which the enzymes act determines the final structure. Because this process depends strongly on protein structure and cell type, it can be altered by differentiation, by metabolic changes and in disease states. For this reason, N-glycans can act as signals and must be carefully characterized in studies of development, immune function and cancer.
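The Asn-X-Ser/Thr rule lends itself to a simple sequence scan. The following is a minimal sketch that lists candidate sequons in a one-letter protein sequence; the function name and the example sequence are illustrative assumptions. A match only marks a potential site, since, as noted above, the protein must also enter the secretory pathway for the site to be used.

```python
import re

# N followed by any residue except proline, then serine or threonine.
# The lookahead keeps matches zero-width after N, so overlapping
# sequons (e.g. "NNST") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(protein_sequence):
    """Return 1-based positions of Asn residues in Asn-X-Ser/Thr sequons."""
    seq = protein_sequence.upper()
    return [match.start() + 1 for match in SEQUON.finditer(seq)]

# Hypothetical sequence: N2 and N9 sit in sequons, N6 is followed by proline.
print(find_sequons("MNGTANPSNAS"))  # [2, 9]
```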
Fig. 1 Schematic Illustration of Human Glycoprotein Modification.1,5
The common core structure Manα1-6(Manα1-3)Manβ1-4GlcNAcβ1-4GlcNAcβ1-Asn is shared by all subclasses of N-glycans and derives from the precursor oligosaccharide transferred in the ER. Three major topological classes of N-glycans can be defined on the basis of this core: high-mannose N-glycans, which retain most of the mannose residues of the precursor oligosaccharide; hybrid N-glycans, which combine mannose-rich and processed antennae; and complex N-glycans, which result from extensive remodeling into GlcNAc-initiated antennae whose termini are elaborated with galactose, sialic acid and/or fucose residues. Processing of antennae produces biantennary, triantennary or tetra-antennary glycans, each of which can carry multiple terminal sugars capable of binding lectin receptors with differing avidity. Attachment of fucose in α1-6 linkage to the innermost GlcNAc residue of the core (core fucosylation) also modulates affinity for different receptors. Addition of a bisecting GlcNAc by β1,4-N-acetylglucosaminyltransferase III (GnT-III) alters glycan flexibility and can affect antibody effector functions. Glycans are often further modified by terminal sialylation, sulfation, and/or additional fucosylation, which adds negatively charged residues that repel nonspecific interactions but also provides specific ligands for selectins and siglecs. This combinatorial expansion of structural variation produces microheterogeneity at each glycosylation site: a population of glycoforms whose relative abundances depend on context, including development, differentiation, and disease, and which thereby encodes information about the state of the cell. Glycan structures are not directly templated by the genome; they are shaped by enzyme expression and metabolic state.
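The three subclasses can be roughly told apart from monosaccharide composition alone. The sketch below is a heuristic under stated assumptions, not an established algorithm: it subtracts the shared Man3GlcNAc2 core and compares the remaining hexose and HexNAc counts. Composition cannot resolve isomers (for example, a bisecting GlcNAc or a small hybrid versus a monoantennary complex glycan), so the labels are best guesses that structural methods would need to confirm.

```python
CORE_HEX = 3     # the three mannoses of the shared pentasaccharide core
CORE_HEXNAC = 2  # the two GlcNAc residues of the core

def classify_n_glycan(hex_count, hexnac_count):
    """Heuristic N-glycan subclass from hexose and HexNAc counts."""
    if hex_count < CORE_HEX or hexnac_count < CORE_HEXNAC:
        return "not a complete N-glycan core"
    extra_hex = hex_count - CORE_HEX
    extra_hexnac = hexnac_count - CORE_HEXNAC
    if extra_hexnac == 0:
        # No antennary GlcNAc: any extra hexose is assumed to be mannose.
        return "high-mannose" if extra_hex > 0 else "core only"
    if extra_hex > extra_hexnac:
        # More hexose than the antennae can carry as galactose:
        # assume some mannose-rich arm remains.
        return "hybrid"
    return "complex"

print(classify_n_glycan(9, 2))  # high-mannose
print(classify_n_glycan(6, 3))  # hybrid
print(classify_n_glycan(5, 4))  # complex
```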
N-glycans are multi-purpose molecular switches that control protein folding, intracellular and cell-surface trafficking, and intercellular signaling by steric, chemical and recognition means. In the ER lumen, the added oligosaccharide serves as a folding sensor. While a nascent polypeptide chain has not yet reached its native conformation, the terminal glucose residues are trimmed to leave a monoglucosylated recognition epitope that binds molecular chaperones and prevents export from the ER. If a misfolded protein persists in the ER, its mannose residues are progressively trimmed, which generates a signal recognized by the ER-associated degradation (ERAD) system. This tags the protein for delivery to the cytosol, where it is degraded by the proteasome, thereby preventing it from accumulating and becoming toxic. Folded glycoproteins are sorted to specific subcellular destinations according to their mature N-glycans. Mannose-6-phosphate residues added to lysosomal enzymes target them to the lysosome. Complex-type N-glycans containing a bisecting GlcNAc and terminal sialic acids are preferentially sorted to the plasma membrane for recycling. At the cell surface, N-glycans sterically regulate ligand-receptor dimerisation and can shield receptors from phosphatases, prolonging or amplifying kinase signalling cascades. Extracellularly, sialylation or fucosylation of terminal galactose residues on circulating glycoproteins can make them ligands for anti-inflammatory inhibitory lectins, while galactose or mannose residues that become exposed on injured or dying cells can promote complement fixation and opsonisation for phagocyte clearance. Dense clusters of N-glycans on T-cell receptors can also affect activation thresholds, and N-glycans on antibody Fc regions tune effector functions by controlling Fcγ receptor interactions.
Increased branching and hypersialylation of N-glycans is common in cancer and can promote metastasis by stabilizing active conformations of growth-factor receptors and masking tumor cells from immune detection.
O-glycans are a major family of glycans in which the anomeric carbon of the first sugar is linked to the hydroxyl oxygen of a serine or threonine residue in a protein. In contrast to N-linked glycans, which are transferred as a pre-assembled structure from a lipid carrier in the ER, O-glycans are added in a stepwise fashion in the Golgi, often after protein folding is complete. Together with the lack of a consensus sequence, this allows much more flexible regulation and rapid reorganization of cell-surface chemistry in response to changes such as development, inflammation, and nutrient availability. Mucin-type O-glycans, which start with GalNAc and branch into many core structures, make up the majority of O-glycans. Other types also exist, such as O-linked fucose and glucose, and the O-linked GlcNAc that modifies proteins in the nucleus and cytoplasm. These structures contribute to the diversity of glycan functions, such as lubrication, barrier formation, and modulation of cell-surface receptors.
Fig. 2 Schematic representation of O-glycan cores linked to mucin.2,5
The blueprint of O-glycans is provided by the still poorly understood process of their priming by the transfer of the first sugar (most often N-acetylgalactosamine) onto the hydroxyl group of serine or threonine in a post-translational Golgi reaction catalyzed by tissue-specific sets of polypeptide GalNAc-transferases. In contrast to N-linked glycosylation, the lack of a strict sequon requirement means that O-glycans are randomly distributed over accessible loops and proline-rich linkers. From the priming GalNAc residue, O-glycan biosynthesis diverges into at least four major O-glycan core structures (Core 1-4) that are further extended in various ways. Core 1 or T-antigen is a Galβ1-3GalNAc disaccharide, present on all cell types and often modified by sialic acid (forming negatively charged epitopes). Core 2 structures are characterized by the addition of a GlcNAcβ1-6 branch that provides a site for poly-N-acetyllactosamine (LacNAc) extension and eventual fucose or sialic acid capping, further increasing the structural diversity. Core 3 and 4 structures are primarily found on epithelial mucins and form large clustered O-glycan microdomains. The lack of a pre-assembled precursor for O-glycans means that chain length and branching are probabilistic, which together with alternative sialic acid and fucose caps on extended chains, and with post-translational sulfation and O-acetylation, gives rise to a huge structural combinatorial space with very subtle tuning of protein conformation, protease susceptibility and lectin binding.
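The four major cores can be summarized by which sugars substitute the priming GalNAc. The lookup below is a small sketch: Core 1 and the β1-6 GlcNAc branch of Core 2 follow the description above, while the Core 3 and Core 4 entries (GlcNAcβ1-3GalNAc, with or without a β1-6 GlcNAc branch) are taken from standard mucin-core nomenclature rather than from this text. The function name and argument names are illustrative.

```python
# Keys: (sugar on the beta1-3 position of GalNAc, sugar on the beta1-6 position).
O_GLYCAN_CORES = {
    ("Gal", None): "Core 1",         # Gal-beta1-3-GalNAc (T antigen)
    ("Gal", "GlcNAc"): "Core 2",     # Core 1 plus a GlcNAc-beta1-6 branch
    ("GlcNAc", None): "Core 3",      # GlcNAc-beta1-3-GalNAc
    ("GlcNAc", "GlcNAc"): "Core 4",  # Core 3 plus a GlcNAc-beta1-6 branch
}

def identify_core(beta1_3, beta1_6=None):
    """Name the mucin-type core from the substituents on the priming GalNAc."""
    return O_GLYCAN_CORES.get((beta1_3, beta1_6), "unassigned")

print(identify_core("Gal"))               # Core 1
print(identify_core("Gal", "GlcNAc"))     # Core 2
print(identify_core("GlcNAc", "GlcNAc"))  # Core 4
```

Anything outside Cores 1 to 4, including an unsubstituted GalNAc, falls through to "unassigned" in this sketch.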
O-glycans serve a variety of biological functions. One common function is steric shielding of the peptide backbone from enzymatic cleavage: a cluster of sialylated O-glycans can protect nearby proteolytic cleavage sites, increasing the half-life of secreted cytokines, receptors and peptide hormones. For example, O-glycosylation of tumor necrosis factor α is known to control ectodomain shedding by ADAM proteases, a regulatory mechanism that can directly affect inflammatory responses. Clusters of O-glycans on mucins, which form a viscoelastic hydrogel on epithelial cell surfaces, can trap pathogens and serve as a non-specific barrier to microbial penetration, while also lubricating and hydrating the epithelium. The terminal sugars of these glycans can also act as decoy ligands for pathogen adhesins, diverting pathogens away from other cell-surface receptors. In the immune system, selectin ligands are heavily O-glycosylated, and the composition of their O-glycans (including sialylation and fucosylation) is important for regulating leukocyte rolling and extravasation during inflammation. In the nucleus, O-GlcNAc is a distinct O-linked modification that can modulate the activity of transcription factors by competing with phosphorylation, thereby linking metabolic state to gene regulation. Aberrant O-glycosylation, such as the truncated Tn antigen displayed on tumor cells, can hide these cells from natural killer cells by engaging inhibitory lectins on the killer cells.
In addition to the well-known N- and O-linked glycans, there are some less common but biologically important modifications that can be added to proteins. These include C-glycans, S-glycans, glycosaminoglycan linkages, and GPI anchors. None of these linkages is made by the classical N- and O-glycan biosynthetic machinery, and each has been selected for very specific chemical properties. C-glycans are connected to proteins through a carbon–carbon bond to the indole ring of a tryptophan residue, which makes the glycan difficult to cleave enzymatically or hydrolytically. They are found in proteins that require high structural stability. S-glycans are sugars connected to cysteine residues via a sulfur atom. They are much less common, and known examples are still limited to a few specific cases, in which the redox properties of the glycoconjugate may be used to tune protein activity. Glycosaminoglycans (GAGs) are long chains of repeating disaccharides that are added to serine residues of proteoglycans. They sequester growth factors, build up morphogen gradients, and provide structural support in connective tissue. GPI anchors are complex glycolipids that are added to proteins and tether them to the outer face of the plasma membrane. The anchored proteins can be released by phospholipases, a mechanism used in the release of signaling molecules and in antigenic variation in parasites. Glycosylation can thus be achieved through more than N- and O-linkages to proteins. According to this scheme, these types of glycans are much less common than N- or O-linked glycans and require chemical cleavage for release, as they are not released by PNGase F or by β-elimination. As a result, their detection is more challenging and often requires chemical methods or top-down MS.
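The linkage-based scheme can be written down as a lookup from the modified residue to the candidate glycan classes and the release approach described above. This is an illustrative sketch: the table simply restates the text, the function names are hypothetical, and GPI anchors are left out because they are not attached through an amino acid side chain.

```python
# One-letter residue code -> glycan classes that can be attached to it.
CLASSES_BY_RESIDUE = {
    "N": ["N-glycan"],
    "S": ["O-glycan", "glycosaminoglycan"],  # serine is shared by two classes
    "T": ["O-glycan"],
    "W": ["C-glycan"],
    "C": ["S-glycan"],
}

# Release approach per class, as stated in the surrounding text.
RELEASE_BY_CLASS = {
    "N-glycan": "PNGase F",
    "O-glycan": "β-elimination",
}

def candidate_classes(residue):
    """Glycan classes that could be attached to the given residue."""
    return list(CLASSES_BY_RESIDUE.get(residue.upper(), []))

def release_method(glycan_class):
    """Release approach for a class; rarer classes need other methods."""
    return RELEASE_BY_CLASS.get(glycan_class, "chemical cleavage or top-down MS")

print(candidate_classes("S"))      # ['O-glycan', 'glycosaminoglycan']
print(release_method("C-glycan"))  # chemical cleavage or top-down MS
```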
C-glycans are defined by the rare carbon–carbon bond through which they are attached. This unusual linkage connects the sugar (mannose) directly to the C2 carbon of the indole ring of a tryptophan side chain. The C–C bond is more stable and less reactive than an N- or O-glycosidic linkage and is resistant to enzymatic cleavage. As a result, C-mannosylation is among the glycan modifications most stable to acid hydrolysis and proteolytic degradation. These chemical properties are useful in proteins that are under considerable conformational stress or that are secreted into harsh extracellular environments. The modification is installed by a still poorly characterized biosynthetic pathway that has been shown to act co-translationally in the ER, so the sugar is added to the emerging polypeptide chain before it has folded. Functionally, C-glycosylation stabilizes tertiary structure by limiting the conformational freedom of small regions, effectively stapling parts of the protein together and restricting the movement of loops. Mutation of C-mannosylation sites can lead to protein misfolding and degradation, pointing to a structural requirement for the modification. C-mannosylation has also been shown to affect protein–protein interactions by altering the electronic properties of the tryptophan residue and the way the indole ring is sterically presented at the interaction surface, which can change receptor–ligand binding. Because C-glycans are not released by standard methods, dedicated mass spectrometric approaches have been developed to detect them, either through top-down sequencing or after chemical hydrolysis.
Although this glycan is rare compared with N- and O-glycans, it is considered a distinct post-translational strategy for increasing the stability and regulatory diversity of a protein, especially under conditions so extreme that N- and O-glycans would be degraded.
S-glycans form another class of glycans, linked to a protein through an S-glycosidic bond between the sugar and the sulfur atom of a cysteine. S-glycosidic bonds are more resistant to hydrolysis, and so are more stable, than O-glycosidic bonds. Because the linkage is made to a cysteine residue, the resulting glycoconjugate is proposed to be redox-sensitive. The determinants of site specificity in S-glycosylation are currently unknown, but the modification is believed to be installed by specific transferases and is thought to take place in the secretory pathway. Sulfur in its oxidized form has a strong affinity for many biological molecules, including the sulfhydryl groups of cysteine. S-glycans have accordingly been proposed to act as redox rheostats that sense oxidizing conditions and alter the properties of the target protein in response. S-glycosylation can serve as a steric shield that blocks access by phosphatases that would otherwise dephosphorylate and switch off a receptor kinase after activation by an extracellular ligand. It can also act as a folding modulator, with S-glycans forming disulfide-like shields that reduce the conformational entropy of the unfolded state and drive folding toward the native conformation. Targeted modification of cysteines with S-glycans has also been reported to influence conditions involving oxidative stress and neurodegenerative diseases in which aggregation plays a role. Analytical techniques for S-glycans are still in development. S-glycans are not readily released by treatment with alkali or enzymes, and current techniques rely on chemical reduction of the linkage to release the glycan for mass spectrometric analysis.
Classification of glycans into structural classes turns the raw complexity of carbohydrates into meaningful biomedical information. It allows the use of class-specific methods, the prediction of structure and functional consequences from biosynthetic pathways, and a more mechanistic understanding of disease-related changes. It also helps to distinguish glycans within the glycome that may have very different roles in disease and drug responses: N-glycans and O-glycans have very different functions, as do glycosaminoglycans and GPI anchors. Classification is particularly useful in clinical translation. For example, N-glycans from serum proteins used as disease biomarkers give information about systemic metabolic changes in the body, whereas O-glycans on mucins give information about changes at the epithelial barrier and the state of inflammation. Classification also allows developers of therapeutic proteins to consider the different glycan classes and their effects on drug properties such as half-life, immunogenicity and interactions with other proteins. In vaccine development, identifying an antigenic glycan as, for example, an N-linked high-mannose or an O-linked mucin-type structure can drive decisions in immunogen selection and adjuvant development. Glycan class will also be needed in the approval process for future biologics, since unclassified glycoforms would amount to undefined process and quality attributes. Glycan classification thus has applications in diagnostics, including precision diagnostics, in process analytics, and in rational drug design.
Classification of glycans into distinct classes has provided a framework for understanding the pathophysiology of diseases and for identifying potential prognostic biomarkers. In cancer, it has shown that tumor cells often display class-specific alterations in glycosylation: N-glycans become more branched and sialylated, supporting immune evasion and metastasis, while O-glycans may become truncated, revealing oncofetal antigens that act as decoy ligands for inhibitory lectins on NK cells. This supports the use of liquid biopsy to measure serum N-glycan profiles as a non-invasive indicator of tumor burden, and tissue analysis of mucin O-glycans to assess local invasion. In autoimmune diseases, classification has helped identify alterations in IgG Fc N-glycosylation, such as decreased galactosylation and sialylation, which can enhance pro-inflammatory effector functions and contribute to disease flares in conditions like lupus and rheumatoid arthritis. In parallel, O-glycan analysis of proteins in synovial fluid can reveal inflammation-specific sulfation patterns that correlate with tissue damage. Neurodegenerative disease research has likewise shown class-specific alterations, such as increased core fucosylation of N-glycans on cerebrospinal fluid proteins, which has been correlated with amyloid plaque burden, and dysregulated O-GlcNAc cycling on tau protein, which affects its aggregation propensity. In infectious diseases, classification has aided in understanding how pathogens use glycan shields to evade host defenses: viruses often display high-mannose N-glycans on envelope proteins to escape neutralizing antibodies, while bacteria may use O-glycans that mimic host structures to avoid immune detection.
By distinguishing these glycan classes, researchers can better attribute functional significance to specific glycan motifs, trace their biosynthetic origins to changes in enzyme expression, and potentially develop class-targeted therapies to restore normal glycosylation patterns, transforming the classification into a diagnostic and therapeutic roadmap.
The classification of glycans has important implications for biopharmaceuticals, because each glycan class must be controlled to ensure therapeutic safety and efficacy. For instance, in monoclonal antibody production, N-glycan classification is critical: afucosylated N-glycans on the Fc region are often enriched to increase antibody-dependent cellular cytotoxicity in cancer treatments, while core-fucosylated and highly sialylated glycoforms are desired in anti-inflammatory therapies to avoid immune activation. Class-specific quantification methods, such as N-glycan profiling by hydrophilic interaction chromatography and dedicated O-glycan analysis, are therefore used in process analytics to monitor antibody quality. For biosimilars, equivalence must be demonstrated by showing that the N-, O-, and C-glycan classes match the reference product within regulatory-specified tolerances, as class-specific variations can affect clearance rates and clinical performance. In vaccine development, glycan classification is used to understand pathogen glycosylation patterns, with strategies such as targeting high-mannose N-glycans on viral envelopes to design glycan-modified immunogens that reveal conserved epitopes, or analyzing O-glycan mimicry in bacterial capsules to select antigens that do not mimic self. Glycoengineering approaches also use this class knowledge to reroute biosynthesis, for example by knocking out fucosyltransferases to bias N-glycans towards afucosylated structures, overexpressing sialyltransferases to increase terminal sialylation of O-glycans on mucosal vaccines, or engineering C-glycosylation to stabilize peptide epitopes. Class knowledge also predicts immunogenic risk: N-glycolylneuraminic acid (Neu5Gc), a non-human sialic acid, should be absent from biologics produced in non-human cells to avoid anti-Neu5Gc immune responses.
As regulatory agencies increasingly require glycan class characterization, the integration of glycan classification into quality-by-design paradigms becomes critical, transforming carbohydrate analysis from a post-hoc characterization step into a proactive control strategy to ensure batch consistency and optimize therapeutic performance.
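The comparison of a biosimilar candidate against a reference product within set tolerances can be sketched as a simple per-attribute check. Everything specific here is hypothetical: the attribute names, the percentage values and the tolerances are placeholders for illustration, and real comparability exercises rely on validated methods and statistical criteria rather than a fixed absolute difference.

```python
def out_of_tolerance(reference, candidate, tolerances):
    """Return attributes whose absolute difference exceeds the tolerance.

    All three arguments map attribute name -> value (e.g. percent abundance).
    An attribute missing from the candidate is treated as 0.0.
    """
    failures = {}
    for attribute, ref_value in reference.items():
        diff = abs(candidate.get(attribute, 0.0) - ref_value)
        if diff > tolerances[attribute]:
            failures[attribute] = round(diff, 6)
    return failures

# Hypothetical glycan attributes, in percent of total glycans.
reference = {"afucosylated": 5.0, "high_mannose": 3.0, "sialylated": 10.0}
candidate = {"afucosylated": 6.0, "high_mannose": 7.5, "sialylated": 9.0}
tolerances = {"afucosylated": 2.0, "high_mannose": 2.0, "sialylated": 2.0}

print(out_of_tolerance(reference, candidate, tolerances))  # {'high_mannose': 4.5}
```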
Classifying glycans by the chemical linkage between sugar and protein and by the associated biosynthetic pathway provides the means to organize and make sense of functional glycomics, and classification schemes will play a crucial role in interpreting the rapidly growing picture of carbohydrate heterogeneity. The division into N-glycans, O-glycans, glycosaminoglycans, and GPI anchors provides a powerful framework for understanding the mechanistic basis of glycan-related disease pathology and for identifying drug targets. Class-specific analytical platforms will become standard for measuring glycan output more accurately, predicting its functional implications, and pointing to new druggable glycan-associated vulnerabilities. Glycan classification will also be integrated with other 'omics' disciplines, such as transcriptomics and metabolomics, to model glycosylation more completely as a dynamic and sensitive sensor of metabolic homeostasis rather than a purely decorative post-translational modification. In clinical and translational settings, it will grow in importance as a quality-control check for biopharmaceutical products, as a component of clinical interpretation tools, and as a source of drug targets and biomarkers.
Glycan classification tools and strategies are being developed and validated in collaboration with regulatory agencies to enable standardization of glycan reference materials, automation of class-specific analytical platforms, and education of clinical practitioners for the glycan-omics information that is becoming essential for interpretation of biological systems.
References