Mohammed AlQuraishi, assistant professor at Columbia University, will give an Apex Lecture on Monday, Jan. 29, at 12:00 p.m. CT in 214 Light Hall.
His talk, “The State of Protein Structure Prediction and Friends,” is hosted by the School of Medicine Basic Sciences and co-sponsored by the Center for Applied AI in Protein Dynamics.
AlQuraishi is a member of the Department of Systems Biology and Columbia’s Program for Mathematical Genomics, where he works at the intersection of machine learning, biophysics, and systems biology. He earned an MS in statistics and a PhD in genetics from Stanford University before joining the Department of Systems Biology at Harvard Medical School as a Departmental Fellow and a Fellow in Systems Pharmacology, where he developed the first end-to-end differentiable model for learning protein structure from data. Prior to starting his academic career, AlQuraishi spent three years founding two startups in the mobile computing space. He joined the Columbia faculty in 2020.
AlQuraishi recently received a 2024 INCITE (Innovative and Novel Computational Impact on Theory and Experiment) award from the U.S. Department of Energy’s Office of Science for his work on OpenFold-powered machine learning of protein-protein interactions and complexes. In his upcoming Apex Lecture, AlQuraishi will discuss OpenFold, an optimized and trainable version of AlphaFold2.
“Our research program sits at the intersection of structural and systems biology,” he notes, “where we aim to build models of biomolecules and their interactions that are sufficiently performant for systems phenomena to emerge from molecular interactions. To accomplish this, we use computer science to study biology in two fundamentally different ways: as a tool to analyze and model biological data, where machine learning is a particularly powerful hammer, and as a conceptual framework for reasoning about and formalizing biological phenomena, where programming languages and computer programs serve as useful analogs. To us, the former is bioinformatics; the latter computational and systems biology.
“Our focus is computational but we host and collaborate with experimental colleagues. We also borrow tools from established fields of quantitative modeling, including control theory and dynamical systems, as no one discipline has a monopoly on good ideas.”
About The Apex Lecture Series
There are major inflection points in biomedical discovery that create new fields, new ideas, and new opportunities to impact human health. To engage with global researchers contributing to these inflection points, the Vanderbilt School of Medicine Basic Sciences launched the Apex Lecture Series in 2023. This school-wide seminar series brings scientists who are influencing the trajectory of their fields to engage with our scientific community on campus.
Lecture Abstract
AlphaFold2 revolutionized structural biology by accurately predicting protein structures from sequences. Its implementation, however, (i) lacks the code and data required to train models for new tasks, such as predicting alternate protein conformations or antibody structures, (ii) is unoptimized for commercially available computing hardware, making large-scale prediction campaigns impractical, and (iii) remains poorly understood with respect to how training data and regimen influence accuracy.
Here we report OpenFold, an optimized and trainable version of AlphaFold2. We train OpenFold from scratch and demonstrate that it fully reproduces AlphaFold2’s accuracy. By analyzing OpenFold training, we find new relationships between data size/diversity and prediction accuracy and gain insights into how OpenFold learns to fold proteins during its training process.