The history of DNA-Encoded Chemical Libraries

Ligand discovery: a central problem in chemistry, biology, and the biomedical sciences

The discovery of specific ligands that bind to protein targets of interest represents an activity of fundamental importance both for basic research and for industrial applications. Many drug discovery programs start with the search for a molecule that interacts with a validated protein target. Similarly, the deciphering of complex biochemical processes in basic research often relies on the availability of specific reagents, which bind (or even block) macromolecular structures of interest, thus allowing their visualization, quantification, or functional investigation.

The value of small organic ligands capable of high-affinity binding to a cognate protein is illustrated by the versatility of the biotin-(strept)avidin system. Derivatives of biotin are recognized by avidin or streptavidin with dissociation constants in the sub-picomolar range. These tight-binding complexes have found numerous applications not only for the development of specific reagents in biochemistry and immunology but also in areas as diverse as chemical synthesis, nuclear medicine, and material sciences, to name just a few.

For many years, the discovery of small organic ligands to protein targets has been performed by screening very large sets of organic molecules (termed chemical libraries), one by one. Large pharmaceutical companies typically construct and assay chemical libraries comprising 1 million organic molecules or more, using high-throughput screening procedures. While the value of high-throughput library screening has been demonstrated in various pharmaceutical applications, it is not uncommon that no binding molecules of sufficient affinity and specificity are discovered using conventional screening campaigns.

In light of these considerations, significant efforts have been, and continue to be, devoted to the discovery and development of methods that facilitate the identification of specific binding molecules to macromolecular targets, and to proteins in particular. DNA-encoded chemical library technology enables the construction and screening of compound sets of unprecedented size and, as a consequence, the discovery of small organic ligands. When the size of a library grows, the concentration of individual library members decreases, to an extent that those molecules may no longer be detectable even with the most sophisticated analytical methods. However, DNA tags allow the amplification, identification, and relative quantification of molecules in very large libraries.
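
The dilution argument can be made concrete with a short calculation. The following Python sketch assumes, purely for illustration, a total library concentration of 1 µM in a 100 µL sample and equal amounts of every member; these values and the function name are assumptions and do not come from the text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole

def per_member_stats(library_size, total_conc_molar=1e-6, volume_liters=1e-4):
    """Return (concentration in M, number of molecules) for one library member.

    Assumes all members are present in equal amounts; the default total
    concentration (1 uM) and sample volume (100 uL) are illustrative only.
    """
    conc = total_conc_molar / library_size
    copies = conc * volume_liters * AVOGADRO
    return conc, copies

for size in (10**3, 10**6, 10**9):
    conc, copies = per_member_stats(size)
    print(f"{size:>13,d} members: {conc:.1e} M each, ~{copies:.1e} copies")
```

Under these assumed conditions, each member of a billion-compound library would be present at about 1 fM, or roughly 6 × 10^4 molecules in the sample. This illustrates why an amplifiable DNA tag, rather than direct detection of the compound itself, becomes attractive as libraries grow.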

From encoded libraries of polypeptides to DNA-encoded chemical libraries

The advent of encoded combinatorial libraries of polypeptides not only has played an important role for the engineering of proteins with novel properties, with applications in many research fields, but also has been conceptually instrumental for the genesis of DNA-encoded chemical libraries. For this reason, it is convenient to briefly discuss a few milestones in this research area.

In 1985, George P. Smith proposed the use of filamentous phage as tools for the display of polypeptides on the surface of these bacterial viruses. In a popular implementation, peptides or proteins are genetically fused to the N-terminal end of the minor coat protein pIII of filamentous phage. The resulting viral particle features a potential functional property (e.g., a binding phenotype, embodied by the polypeptide on the phage surface), while simultaneously bearing the corresponding genetic information (i.e., genotype) as part of the modified phage genome (Figure 1). Soon afterward, however, the groups of Sir Gregory Winter and others realized that very large combinatorial libraries of antibodies, rather than of short peptides, could be created and that phage display could be used to amplify and isolate rare binding specificities within those libraries.


Figure 1: Schematic representation of antibody phage display libraries and of DNA-encoded chemical libraries.


In 1992, one of the authors (R.A.L.), together with Sydney Brenner, postulated that it should be possible to encode chemistry with DNA. The authors envisaged the possibility of simultaneously synthesizing distinctive polypeptide and oligonucleotide sequences on beads, using orthogonal chemistry and split-and-pool procedures. The synthesis of oligonucleotides on beads would not be limited to the stepwise assembly of bases, since other assembly strategies (e.g., the stepwise ligation of DNA fragments) could also be considered. The bar code would act only as an identifier for the corresponding peptide structure and would not act as a genotype in a biosynthetic sense. Indeed, the original article anticipated that the bead could be replaced by a generic chemical linker while preserving the connection between the binding phenotype and the corresponding bar code.
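
The split-and-pool encoding idea described above can be sketched as a toy simulation. The three building blocks, their three-base tags, and the function names below are hypothetical choices made for illustration; they are not taken from the original article, which did not prescribe this particular scheme.

```python
import random

# Hypothetical tag table: each building block is assigned an arbitrary DNA tag.
CODONS = {"G": "ACG", "A": "CTA", "L": "GAT"}
DECODE = {codon: aa for aa, codon in CODONS.items()}

def split_and_pool(n_beads, n_rounds, rng):
    """Simulate split-and-pool synthesis on beads.

    In each round every bead is sent at random to one reaction vessel, where
    one building block and its matching DNA tag are appended; the vessels are
    then pooled before the next round.
    """
    beads = [("", "") for _ in range(n_beads)]  # (peptide, DNA bar code)
    for _ in range(n_rounds):
        vessels = {aa: [] for aa in CODONS}
        for bead in beads:                      # split
            vessels[rng.choice(list(CODONS))].append(bead)
        beads = [(pep + aa, tag + CODONS[aa])   # couple, encode, then pool
                 for aa, members in vessels.items()
                 for pep, tag in members]
    return beads

def decode(tag):
    """Read the peptide sequence back from its bar code, three bases at a time."""
    return "".join(DECODE[tag[i:i + 3]] for i in range(0, len(tag), 3))

beads = split_and_pool(n_beads=1000, n_rounds=3, rng=random.Random(0))
distinct = {pep for pep, _ in beads}
print(len(distinct), "distinct peptides out of a possible", len(CODONS) ** 3)
```

In this sketch the bar code is only a record of the synthetic history of each bead: `decode` recovers the peptide sequence from the tag alone, which mirrors the role of the bar code as an identifier rather than as a genotype in a biosynthetic sense.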

Shortly afterward, Needels and colleagues, as well as Janda and colleagues, exemplified the Brenner and Lerner concept with the synthesis of peptide libraries, successfully using bead-based libraries to retrieve known antibody epitopes. In 2004, three groups reported on the construction of DNA-encoded combinatorial libraries of organic molecules and on the selection of specific binders, using bead-free libraries and affinity-capture procedures similar to those previously used for the panning of phage display libraries. Several DNA-encoded chemical libraries were subsequently synthesized and screened, as discussed in the following sections.


Neri, D., & Lerner, R. A. (2018). DNA-Encoded Chemical Libraries: A Selection System Based on Endowing Organic Compounds with Amplifiable Information. Annual Review of Biochemistry.