A New Definition And Classification Of Antibody Complementarity Determining Regions: Unsupervised Learning Of Protein Backbone Conformations Informs Antibody Structural Bioinformatics And Design

Degree type

Doctor of Philosophy (PhD)

Graduate group

Biochemistry & Molecular Biophysics

Discipline

Subject

Antibody Classification
Antibody Design
Backbone Conformation Clustering
SARS-COV-2
Structural Bioinformatics
Unsupervised Machine Learning
Atomic, Molecular and Optical Physics
Bioinformatics
Biophysics

Copyright date

2022-09-09T20:21:00-07:00

Abstract

One of the main challenges in modern molecular biology is to establish general, robust, and precise descriptions of the relationship between the structural features of molecules (DNA, RNA, proteins, and glycans) and the sequence of their constituent chemical building blocks (nucleotides, amino acids, and monosaccharides). In his 1954 Nobel lecture, Linus Pauling predicted that the chemistry of the future would rely on these descriptions to solve problems in biology and medicine relevant to human health. As of July 8, 2021, X-ray crystallography, NMR, and cryo-EM had been used to solve 179,842 molecular structures, which have been deposited in the Protein Data Bank (PDB) along with their associated sequences. Antibodies are the largest family of protein structures deposited in the PDB, and their importance to human health and to research in molecular biology is widely acknowledged. In this work, I first describe the development and validation of unsupervised learning software to cluster protein backbone conformations (clustering of backbones for Ramachandran analysis, or COBRA). I then describe the application of this software to the wealth of antibody data in the PDB to provide a novel, electron-density-validated classification of the antibody complementarity determining regions (CDRs). I compare this new classification to previous classifications of the CDRs to show three improvements: a stronger association between the sequences and structures of the CDRs, a more robust separation of the various CDR families, and the ability to assess confidence in the quality of CDR families using electron density as support.
In addition to providing a new classification of the antibody CDRs by clustering their backbone conformations, I provide an expanded definition of the antibody binding region by defining, naming, and classifying an antibody V-region segment named the “DE loop”. This segment resembles the six CDRs in its sequence and structural variability, its ability to bind antigen, and its ability to stabilize antibodies, but it is not currently recognized as a canonical member of the CDRs. Finally, I show examples of implementing these analyses in the RosettaAntibodyDesign (RAbD) software to design antibodies against the SARS-CoV-2 Spike Protein Type 1 (S1) Receptor Binding Domain (RBD), and I present the experimental data for the generated antibody designs.

Date of degree

2021-01-01
