Effect of Codons and Secondary Structure on mRNA Protein Expression Levels


mRNA serves as the template for protein synthesis, and the amount of protein produced from each mRNA molecule depends on how efficiently the translation machinery initiates on and elongates along the coding sequence (CDS). It also depends on the functional half-life of the mRNA, the period during which the mRNA actively contributes to protein synthesis. This functional half-life may differ from the physical degradation half-life. Both translation efficiency and functional half-life are determined by the mRNA’s primary sequence. The choice of synonymous codons significantly influences translation: highly expressed mRNAs often contain more optimal codons, whereas non-optimal codons can increase the likelihood of ribosomal pausing and shorten mRNA half-life. Other sequence attributes, such as dinucleotide frequency and the effect of codon order on local tRNA interactions, have also been repeatedly linked to protein expression. Because all of these factors are encoded in the same primary sequence, their individual contributions to protein expression are hard to disentangle, and published findings have been interpreted in conflicting ways.

Figure 1. An RNA secondary structure can be decomposed into several types of nearest-neighbor loops. (K, Sato.; et al, 2021)
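
The loop decomposition named in Figure 1 can be illustrated with a minimal Python sketch. It assumes the structure is given in dot-bracket notation without pseudoknots, and the function names (pair_table, classify_loops) are hypothetical helpers written for this illustration, not part of any published tool. The exterior loop is not counted.

```python
from collections import Counter


def pair_table(structure: str) -> dict[int, int]:
    """Map the index of each '(' to the index of its matching ')'."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced dot-bracket string")
            pairs[stack.pop()] = i
    if stack:
        raise ValueError("unbalanced dot-bracket string")
    return pairs


def classify_loops(structure: str) -> Counter:
    """Count the loop closed by each base pair, by nearest-neighbor loop type."""
    pairs = pair_table(structure)
    counts = Counter()
    for i, j in pairs.items():
        # Collect the base pairs directly enclosed by the pair (i, j).
        inner = []
        k = i + 1
        while k < j:
            if k in pairs:
                inner.append((k, pairs[k]))
                k = pairs[k] + 1  # jump past the enclosed helix
            else:
                k += 1
        if not inner:
            counts["hairpin"] += 1
        elif len(inner) == 1:
            p, q = inner[0]
            left, right = p - i - 1, j - q - 1  # unpaired bases on each side
            if left == 0 and right == 0:
                counts["stack"] += 1
            elif left == 0 or right == 0:
                counts["bulge"] += 1
            else:
                counts["interior"] += 1
        else:
            counts["multiloop"] += 1
    return counts


print(classify_loops("((.(...)))"))
```

Each base pair closes exactly one loop, so the counts sum to the number of base pairs in the structure.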

Besides encoding the amino acid sequence of the protein, the primary sequence of an mRNA shapes its secondary and tertiary structure, and structure in different regions of the mRNA relates to protein expression in different ways. The secondary structure of an mRNA can be altered in two ways. The first is to change the primary sequence directly. Within the coding sequence (CDS), however, any change to the primary sequence also changes codons, which makes it difficult to separate the effect of codon changes from the effect of structural changes on protein expression. The second is to keep the sequence, and therefore the codons, unchanged while incorporating modified nucleotides that retain the same base-pairing partners. Modified nucleotides can stabilize or destabilize base pairs, so this approach changes the local secondary structure, and with it the overall structure of the mRNA, without altering codons.

The Impact of CDS Changes on mRNA Protein Expression

Consider a set of mRNAs that encode the same protein and share identical 5’ and 3’ UTR sequences and a 100-nt poly(A) tail, differing only in the CDS, which is varied through codon optimization. When these mRNAs are synthesized with unmodified uridine (U) and transfected into cells, protein expression differs noticeably between CDS variants. The greater the difference in codon adaptability, the more pronounced the difference in protein expression. Highly expressed mRNA sequences often have a higher GC content, but not every mRNA with high GC content is highly expressed.
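
The two sequence metrics in this paragraph can be sketched in Python. This is an illustration under stated assumptions: "codon adaptability" is taken here to mean a codon adaptation index (CAI) style score, which is one common way to quantify it and may not be the measure behind the results described above. The weights are derived from whatever reference CDS set the caller supplies, since no reference codon usage table is given in this article. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, written in RNA letters (U instead of T).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(cds: str) -> list[str]:
    """Split a CDS into codons, accepting DNA or RNA letters."""
    cds = cds.upper().replace("T", "U")
    if len(cds) % 3:
        raise ValueError("CDS length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def gc_content(seq: str) -> float:
    """Fraction of G and C nucleotides in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def relative_adaptiveness(reference_cds: list[str]) -> dict[str, float]:
    """Weight of each codon: its count divided by the count of the most
    frequent synonymous codon in the caller-supplied reference set."""
    counts = Counter(c for cds in reference_cds for c in codons(cds))
    best = Counter()
    for codon, n in counts.items():
        aa = CODON_TABLE[codon]
        best[aa] = max(best[aa], n)
    return {codon: n / best[CODON_TABLE[codon]] for codon, n in counts.items()}


def cai(cds: str, weights: dict[str, float]) -> float:
    """Geometric mean of codon weights. Met, Trp and stop codons are skipped,
    as are codons absent from the reference set (their weight is undefined)."""
    ws = [weights[c] for c in codons(cds)
          if CODON_TABLE[c] not in "MW*" and c in weights]
    if not ws:
        raise ValueError("no scorable codons in CDS")
    return math.exp(sum(math.log(w) for w in ws) / len(ws))
```

Comparing gc_content and cai across CDS variants of the same protein is one way to check the observation that high GC content often accompanies, but does not guarantee, high expression.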

Figure 2. The degenerate genetic code. (V, P, Mauro.; et al, 2014)

Within a set of mRNA sequences that encode the same protein, each sequence uses a different combination of synonymous codons to encode the same amino acids, and these sequences differ significantly in protein expression in cells. However, comparing the expression of mRNAs that differ in pairs of synonymous codons gives an unexpected result: swapping one synonymous codon for another generally does not significantly affect protein expression, even when the codon has poor adaptability.
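
To make the degeneracy concrete, the following Python sketch enumerates the synonymous coding sequences for a short peptide using the standard genetic code. The function names are hypothetical and chosen for this illustration. The number of possible sequences is the product of the codon counts of each amino acid, which grows very quickly with peptide length, so full enumeration is only practical for short peptides.

```python
from itertools import product
from math import prod

# Standard genetic code, written in RNA letters (U instead of T).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Reverse map: amino acid -> list of its synonymous codons.
SYNONYMS: dict[str, list[str]] = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences that encode the peptide."""
    return prod(len(SYNONYMS[aa]) for aa in peptide)


def iter_encodings(peptide: str):
    """Yield every synonymous coding sequence for the peptide."""
    for combo in product(*(SYNONYMS[aa] for aa in peptide)):
        yield "".join(combo)


def translate(cds: str) -> str:
    """Translate an RNA coding sequence into a one-letter peptide string."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


print(count_encodings("MKL"))  # Met has 1 codon, Lys has 2, Leu has 6
```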

The Effect of Modified Nucleotides on mRNA Protein Expression

Replacing U with modified nucleotides has a substantial impact on protein expression relative to the unmodified mRNA. The effect of a given modified nucleotide varies widely: in some mRNA sequences it significantly boosts protein expression, while in others it reduces it. For the majority of mRNA sequences, certain modified nucleotides notably enhance protein expression. Within a group of mRNA sequences that encode the same protein with different CDS variants, the sequences with the highest and the lowest protein expression are not the same from one modified nucleotide to another.

This raises the question of whether modified nucleotides act directly by altering ribosomal decoding. If so, protein expression should correlate with the overall content of modified nucleotides or with specific codons that contain them. In practice, there is no evident relationship between protein expression and the percentage of U in the sequence. A few codons containing modified nucleotides can significantly affect protein expression, but no universal pattern has been found, which indicates that CDS variation and modified nucleotides affect protein expression in different ways.
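
The correlation test described here can be sketched as follows. This is a minimal illustration and not the analysis behind the result above: it assumes that every U position carries the modified nucleotide when a modified mRNA is synthesized, and it uses a plain Pearson coefficient. The function names are hypothetical, and no expression data are included; the caller supplies measured values.

```python
import math


def u_fraction(seq: str) -> float:
    """Fraction of positions that are U (T is treated as U)."""
    seq = seq.upper().replace("T", "U")
    return seq.count("U") / len(seq)


def u_codon_fraction(cds: str) -> float:
    """Fraction of codons that contain at least one U."""
    cds = cds.upper().replace("T", "U")
    if len(cds) % 3:
        raise ValueError("CDS length must be a multiple of 3")
    triplets = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return sum("U" in t for t in triplets) / len(triplets)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

In use, one would compute u_fraction for each CDS variant and pass those values together with the measured expression levels to pearson. A coefficient near zero would be consistent with the absence of a relationship reported above.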

Figure 3. Key components of in vitro transcribed mRNA that determine the level and duration of expression of the encoded protein. (A, Esprit.; et al, 2020)

The Effect of mRNA Secondary Structure on mRNA Protein Expression

For the same sequence, m1Ψ (N1-methylpseudouridine) typically gives higher expression than U (uridine) or mo5U (5-methoxyuridine). Biophysical studies indicate that, relative to U, m1Ψ stabilizes and mo5U destabilizes base pairing, so the two have significantly different and opposing effects on overall mRNA folding, nearest-neighbor base pairing, and secondary structure. In other words, incorporating modified nucleotides into an mRNA changes its secondary structure.

The effect of secondary structure on protein expression depends on where in the mRNA it occurs. The less secondary structure there is in the 5’ UTR and the first 30 nucleotides of the CDS, the higher the protein expression. Unexpectedly, however, more secondary structure downstream of the first 30 nucleotides of the CDS and in the 3’ UTR is associated with higher protein expression. Compared with mRNAs with a moderate level of secondary structure, highly structured mRNAs have a longer functional half-life, whether their codon adaptability is moderate or optimal. Secondary structure can therefore increase protein output by extending the mRNA’s functional half-life, acting as a regulatory mechanism independent of codon adaptability.
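
The regional comparison above can be illustrated with a rough Python sketch. It is not the method behind the findings described here: it swaps in the Nussinov algorithm, which maximizes the number of base pairs, as a crude stand-in for thermodynamic folding, and it folds the two regions separately, which ignores any pairing between them. The function names and the "paired fraction" score are assumptions made for this illustration.

```python
PAIRS = ({"A", "U"}, {"G", "C"}, {"G", "U"})  # Watson-Crick plus G-U wobble


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Nussinov dynamic programme: maximum number of nested base pairs,
    with at least min_loop unpaired bases inside every hairpin."""
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    if n == 0:
        return 0
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i + 1][j]  # case: base i stays unpaired
            for k in range(i + min_loop + 1, j + 1):  # case: i pairs with k
                if {seq[i], seq[k]} in PAIRS:
                    right = dp[k + 1][j] if k < j else 0
                    best = max(best, 1 + dp[i + 1][k - 1] + right)
            dp[i][j] = best
    return dp[0][n - 1]


def paired_fraction(seq: str) -> float:
    """Upper bound on the fraction of bases that can be paired."""
    return 2 * max_base_pairs(seq) / len(seq) if seq else 0.0


def region_pairing(utr5: str, cds: str, utr3: str, head: int = 30) -> dict[str, float]:
    """Score the start region (5' UTR plus first `head` nt of the CDS)
    and the downstream region (rest of the CDS plus 3' UTR) separately."""
    return {
        "start_region": paired_fraction(utr5 + cds[:head]),
        "downstream_region": paired_fraction(cds[head:] + utr3),
    }
```

The dynamic programme runs in cubic time, so this sketch is only practical for short regions. A thermodynamic folding program would be the appropriate choice for full-length mRNAs and for any quantitative comparison.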

Figure 4. Regulatory effects of RNA secondary structures over splicing and translation. (I, G, Soares.; et al, 2022)

References

1. K, Sato.; et al. RNA Secondary Structure Prediction Using Deep Learning with Thermodynamic Integration. Nature Communications. 2021, 12: 941.

2. V, P, Mauro.; et al. A Critical Analysis of Codon Optimization in Human Therapeutics. Trends in Molecular Medicine. 2014, 20(11): 604–613.

3. A, Esprit.; et al. Neo-Antigen mRNA Vaccines. Vaccines. 2020, 8(4): 776.

4. I, G, Soares.; et al. Secondary Structures in RNA Synthesis, Splicing and Translation. Computational and Structural Biotechnology Journal. 2022, 20: 2871–2884.