End-to-end differentiable molecular mechanics force field construction

Yuanqing Wang, Josh Fass, and John D. Chodera
Chemical Science 13:12016, 2022 [DOI] [arXiv] [pytorch code] [JAX code]

Molecular mechanics force fields have been a workhorse for computational chemistry and drug discovery. Here, we propose a new approach to force field parameterization in which graph convolutional networks are used to perceive chemical environments and assign molecular mechanics (MM) force field parameters. The entire process of chemical perception and parameter assignment is differentiable end-to-end with respect to model parameters, allowing new force fields to be easily constructed from MM or QM force fields, extended, and applied to arbitrary biomolecules.

Teaching free energy calculations to learn from experimental data

Marcus Wieder, Josh Fass, and John Chodera
[bioRxiv] [code] [data]

We show, for the first time, that alchemical free energy calculations can not only compute free energy differences between small molecules involving covalent bond rearrangements in systems treated entirely with quantum machine learning potentials, but can also learn to generalize efficiently when conditioned on experimental free energy data.

The Open Force Field Evaluator: An automated, efficient, and scalable framework for the estimation of physical properties from molecular simulation

Simon Boothroyd, Lee-Ping Wang, David L. Mobley, John D. Chodera, and Michael R. Shirts

Preprint ahead of submission: [ChemRxiv]

We describe a new software framework for automated evaluation of physical properties for the benchmarking and optimization of small molecule force fields according to best practices.

Antibodies to the SARS-CoV-2 receptor-binding domain that maximize breadth and resistance to viral escape

Tyler N Starr, Nadine Czudnochowski, Fabrizia Zatta, Young-Jun Park, Zhuoming Liu, Amin Addetia, Dora Pinto, Martina Beltramello, Patrick Hernandez, Allison J Greaney, Roberta Marzi, William G Glass, Ivy Zhang, Adam S Dingens, John E Bowen, Jason A Wojcechowskyj, Anna De Marco, Laura E Rosen, Jiayi Zhou, Martin Montiel-Ruiz, Hannah Kaiser, Heather Tucker, Michael P Housley, Julia Di Iulio, Gloria Lombardo, Maria Agostini, Nicole Sprugasci, Katja Culap, Stefano Jaconi, Marcel Meury, Exequiel Dellota, Elisabetta Cameroni, Tristan I Croll, Jay C Nix, Colin Havenar-Daughton, Amalio Telenti, Florian A Lempp, Matteo Samuele Pizzuto, John D Chodera, Christy M Hebner, Sean PJ Whelan, Herbert W Virgin, David Veesler, Davide Corti, Jesse D Bloom, Gyorgy Snell
Nature, in press. [DOI] [bioRxiv] [GitHub]

We comprehensively characterize escape, breadth, and potency across a panel of SARS-CoV-2 antibodies targeting the receptor binding domain, including the parent antibody of the recently approved Vir antibody drug (Sotrovimab), illuminating escape mutations with structural and dynamic insight into their mechanism of action.

Best practices for alchemical free energy calculations

Mey ASJS, Allen B, Bruce Macdonald HE, Chodera JD, Kuhn M, Michel J, Mobley DL, Naden LN, Prasad S, Rizzi A, Scheen J, Shirts MR, Tresadern G, and Xu H.
Living Journal of Computational Molecular Sciences 2022 [DOI]
[arXiv] [GitHub]

This living review for the Living Journal of Computational Molecular Sciences (LiveCoMS) covers the essential considerations for running alchemical free energy calculations for rational molecular design for drug discovery.

Towards chemical accuracy for alchemical free energy calculations with hybrid physics-based machine learning / molecular mechanics potentials

Dominic A. Rufa, Hannah E. Bruce Macdonald, Josh Fass, Marcus Wieder, Patrick B. Grinaway, Adrian E. Roitberg, Olexandr Isayev, and John D. Chodera.
Preprint ahead of submission.
[bioRxiv] [GitHub]

In this first use of hybrid machine learning / molecular mechanics (ML/MM) potentials for alchemical free energy calculations, we demonstrate how the improved modeling of intramolecular ligand energetics offered by the quantum machine learning potential ANI-2x can significantly improve the accuracy in predicting kinase inhibitor binding free energy by reducing the error from 0.97 kcal/mol to 0.47 kcal/mol, which could drastically reduce the number of compounds that must be synthesized in lead optimization campaigns for minimal additional computational cost.

Is structure based drug design ready for selectivity optimization?

Steven K. Albanese, John D. Chodera, Andrea Volkamer, Simon Keng, Robert Abel, and Lingle Wang
Journal of Chemical Information and Modeling [DOI] [bioRxiv] [GitHub]

We asked whether the similarity of binding sites in related kinases might result in a fortuitous cancellation of errors in using alchemical free energy calculations to predict kinase inhibitor selectivities. Surprisingly, we find that even distantly related kinases have sufficient correlation in their errors that predicting changes in selectivity can be much more accurate than predicting changes in potency due to this effect, and show how this could lead to large reductions in the number of molecules that must be synthesized to achieve a desired selectivity goal.
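The error cancellation described above follows from the variance of a difference of two correlated errors. The following is a minimal sketch, assuming the prediction errors for the two kinases have standard deviations sigma_1 and sigma_2 and correlation rho; the function name and the numerical values are illustrative assumptions, not results from the paper.

```python
import math

def selectivity_error(sigma_1, sigma_2, rho):
    """Standard deviation of the error in a predicted selectivity change.

    Selectivity is the difference of two binding free energy changes, so its
    error is the difference of two (possibly correlated) errors:
        Var(e1 - e2) = sigma_1^2 + sigma_2^2 - 2 * rho * sigma_1 * sigma_2
    """
    variance = sigma_1**2 + sigma_2**2 - 2.0 * rho * sigma_1 * sigma_2
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative round-off

# Hypothetical per-target error of 1.0 kcal/mol, for illustration only.
sigma = 1.0
for rho in (0.0, 0.5, 0.9):
    err = selectivity_error(sigma, sigma, rho)
    print(f"rho = {rho:.1f}: selectivity error = {err:.2f} kcal/mol")
```

With equal per-target errors, uncorrelated errors make the selectivity error larger than the per-target error by a factor of sqrt(2), while any correlation above 0.5 makes it smaller than the per-target error.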

Standard state free energies, not pKas, are ideal for describing small molecule protonation and tautomeric states

M R Gunner, Taichi Murakami, Ariën S. Rustenburg, Mehtap Işık, and John D. Chodera.
Journal of Computer Aided Molecular Design 34:561, 2020. [DOI] [PDF] [GitHub]

Here, we demonstrate how the physical nature of protonation and tautomeric state effects means that the standard state free energies of each microscopic protonation/tautomeric state at a single pH are sufficient to describe the complete pH-dependent microscopic and macroscopic populations. We introduce a new kind of diagram that uses this concept to illustrate a variety of pH-dependent phenomena, and show how it can be used to identify common issues with protonation state prediction algorithms. As a result, we recommend that future blind prediction challenges utilize microstate free energies at a single reference pH as the minimal sufficient information for assessing prediction accuracy and utility.
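The idea that free energies at one reference pH determine populations at every pH can be sketched in a few lines. The example below assumes the standard relation that each bound titratable proton raises a microstate's free energy by RT ln(10) per unit increase in pH; the function name, the reference pH of 7, and the monoprotic acid with pKa 5 are hypothetical choices for illustration and are not taken from the paper's code.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def microstate_populations(free_energies_ref, n_protons, pH,
                           pH_ref=7.0, temperature=298.15):
    """Populations of microstates at `pH` from free energies given at `pH_ref`.

    free_energies_ref: free energy of each microstate at pH_ref (kcal/mol)
    n_protons: number of titratable protons bound in each microstate
    """
    rt = R_KCAL * temperature
    # Each bound proton costs an extra RT ln(10) per unit of pH increase.
    shifted = [
        g + n * rt * math.log(10.0) * (pH - pH_ref)
        for g, n in zip(free_energies_ref, n_protons)
    ]
    g_min = min(shifted)  # subtract the minimum for numerical stability
    weights = [math.exp(-(g - g_min) / rt) for g in shifted]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical monoprotic acid with pKa 5: at pH_ref = 7 the protonated state
# lies RT ln(10) * (7 - 5) above the deprotonated state.
rt = R_KCAL * 298.15
g_ref = [2.0 * rt * math.log(10.0), 0.0]  # [HA, A-]
protons = [1, 0]
for pH in (3.0, 5.0, 7.0):
    p_HA, p_A = microstate_populations(g_ref, protons, pH)
    print(f"pH {pH:.1f}: HA = {p_HA:.3f}, A- = {p_A:.3f}")
```

In this sketch the two states are equally populated at pH 5, the assumed pKa, and no pKa value is needed as input: the free energies at the reference pH and the proton counts carry all of the information.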

Assessing the accuracy of octanol-water partition coefficient predictions in the SAMPL6 Part II log P Challenge

Mehtap Işık, Teresa Danielle Bergazin, Thomas Fox, Andrea Rizzi, John D. Chodera, and David L. Mobley.
Journal of Computer Aided Molecular Design, 34:335, 2020. [DOI] [PDF] [bioRxiv] [GitHub]

We report the performance assessment of the 91 methods that were submitted to the SAMPL6 blind challenge for predicting octanol-water partition coefficient (logP) measurements. The average RMSEs of the five most accurate MM-based, QM-based, empirical, and mixed-approach methods were 0.92±0.13, 0.48±0.06, 0.47±0.05, and 0.50±0.06 logP units, respectively.

The SAMPL6 SAMPLing challenge: Assessing the reliability and efficiency of binding free energy calculations

Andrea Rizzi, Travis Jensen, David R. Slochower, Matteo Aldeghi, Vytautas Gapsys, Dimitris Ntekoumes, Stefano Bosisio, Michail Papadourakis, Niel M. Henriksen, Bert L. de Groot, Zoe Cournia, Alex Dickson, Julien Michel, Michael K. Gilson, Michael R. Shirts, David L. Mobley, and John D. Chodera
Journal of Computer Aided Molecular Design 34:601, 2020. [DOI] [PDF] [bioRxiv] [GitHub]

To assess the relative efficiencies of alchemical binding free energy calculations, the SAMPL6 SAMPLing challenge asked participants to submit predictions as a function of computational effort for the same force field and charge model. Surprisingly, we found that most molecular simulation codes cannot agree on what the binding free energy is, even for the same force field.

Binding thermodynamics of host-guest systems with SMIRNOFF99Frosst 1.0.5 from the Open Force Field Initiative

David R. Slochower, Niel M. Henriksen, Lee-Ping Wang, John D. Chodera, David L. Mobley, and Michael K. Gilson.
Journal of Chemical Theory and Computation ASAP. [DOI] [bioRxiv] [GitHub]

We assess the accuracy of the SMIRNOFF99Frosst 1.0.5 force field in reproducing host-guest binding thermodynamics in comparison with the GAFF force field, demonstrating how the SMIRNOFF format for compactly specifying force fields provides comparable accuracy with 20x fewer parameters.

The dynamic conformational landscapes of the protein methyltransferase SETD8

Rafal P. Wiewiora*, Shi Chen*, Fanwang Meng, Nicolas Babault, Anqi Ma, Wenyu Yu, Kun Qian, Hao Hu, Hua Zou, Junyi Wang, Shijie Fan, Gil Blum, Fabio Pittella-Silva, Kyle A. Beauchamp, Wolfram Tempel, Hualing Jiang, Kaixian Chen, Robert Skene, Y. George Zheng, Peter J. Brown, Jian Jin, John D. Chodera+, and Minkui Luo+.
eLife 8:e45403, 2019. [DOI] [bioRxiv] [GitHub] [OSF] [movies] [MSKCC blog post]
* These authors contributed equally to this work
+ Co-corresponding authors

In this work, we show how targeted X-ray crystallography using covalent inhibitors and depletion of native ligands to reveal structures of low-population hidden conformations can be combined with massively distributed molecular simulation to resolve the functional dynamic landscape of the protein methyltransferase SETD8 in unprecedented atomistic detail. Using an aggregate of six milliseconds of fully atomistic simulation from Folding@home, we use Markov state models to illuminate the conformational dynamics of this important epigenetic protein.

All Folding@home simulation trajectories for this paper are available on the Open Science Framework.

The trajectories generated for this project were used as the source for a unique musical composition 'Metastable' by George Holloway, performed by the Ligeti String Quartet with visual accompaniment from Robert Arbon.

OpenPathSampling: A Python framework for path sampling simulations. II. Building and customizing path ensembles and sample schemes

David W.H. Swenson, Jan-Hendrik Prinz, Frank Noé, John D. Chodera, Peter G. Bolhuis
Journal of Chemical Theory and Computation 15:837, 2019. [DOI] [bioRxiv] [PDF] [GitHub] [openpathsampling.org]

To make powerful path sampling techniques broadly accessible and efficient, we have produced a new Python framework for easily implementing path sampling strategies (such as transition path and interface sampling). This second publication describes advanced aspects of the theory and details of how to customize path ensembles.

OpenPathSampling: A Python framework for path sampling simulations. I. Basics

David W.H. Swenson, Jan-Hendrik Prinz, Frank Noé, John D. Chodera, Peter G. Bolhuis
Journal of Chemical Theory and Computation 15:813, 2019 [DOI] [bioRxiv] [PDF] [GitHub] [openpathsampling.org]

To make powerful path sampling techniques broadly accessible and efficient, we have produced a new Python framework for easily implementing path sampling strategies (such as transition path and interface sampling). This first publication describes some of the theory and capabilities behind the approach.

An open library of human kinase domain constructs for automated bacterial expression

Steven K. Albanese*, Daniel L. Parton*, Mehtap Işık**, Lucelenie Rodríguez-Laureano**, Sonya M. Hanson, Julie M. Behr, Scott Gradia, Chris Jeans, Nicholas M. Levinson, Markus A. Seeliger, and John D. Chodera.
* co-first author; ** co-second author
Biochemistry 57:4675, 2018. [DOI] [PDF] [bioRxiv] [GitHub]
Interactive data browser: [github.io]
Plasmids available via AddGene

Human kinase catalytic domains (the therapeutic target of selective kinase inhibitors used in the treatment of cancer and other diseases) are notoriously difficult and expensive to express in insect or human cells. Here, we utilize the phosphatase co-expression technology developed by Markus Seeliger (now at Stony Brook) to develop a library of human kinase catalytic domains for facile and inexpensive expression in bacteria.

pKa measurements for the SAMPL6 prediction challenge for a set of kinase inhibitor-like fragments

Mehtap Işık, Dorothy Levorse, Ariën S. Rustenburg, Ikenna E. Ndukwe, Heather Wang, Xiao Wang, Mikhail Reibarkh, Gary E. Martin, Alexey A. Makarov, David L. Mobley, Timothy Rhodes*, John D. Chodera*.
* co-corresponding authors
Journal of Computer-Aided Molecular Design special issue on SAMPL6 32:1117, 2018.
[DOI] [PDF] [bioRxiv] [Supplementary Tables and Figures] [Supplementary Data (includes Sirius T3 reports on all measurements)]

The SAMPL5 blind challenge exercises identified neglect of protonation state effects as a major accuracy-limiting factor in physical modeling of biomolecular interactions. In this study, we report the experimental measurements behind a SAMPL6 blind challenge in which we assess the ability of community codes to predict small molecule pKas for small molecules resembling fragments of selective kinase inhibitors.

Predicting resistance of clinical Abl mutations to targeted kinase inhibitors using alchemical free-energy calculations

Kevin Hauser, Christopher Negron, Steven K. Albanese, Soumya Ray, Thomas Steinbrecher, Robert Abel, John D. Chodera, and Lingle Wang.
Communications Biology 1:70, 2018 [DOI] [PDF] [input files and analysis scripts]

In our first collaborative paper with Schrödinger, we present the first comprehensive benchmark assessing the ability of alchemical free energy calculations to predict clinical mutational resistance or susceptibility to targeted kinase inhibitors using the well-studied kinase Abl, the target of therapy for chronic myelogenous leukemia (CML).

Quantifying configuration-sampling error in Langevin simulations of complex molecular systems

Josh Fass, David Sivak, Gavin E. Crooks, Kyle A. Beauchamp, Benedict Leimkuhler, and John Chodera.
Entropy 20:318, 2018. [DOI] [PDF] [GitHub] [bioRxiv preprint]

Molecular dynamics simulations necessarily use a finite timestep, which introduces error or bias in the sampled configuration space density that grows rapidly with increasing timestep. For the first time, we show how to compute a natural measure of this error (the KL divergence) in both phase and configuration space for a widely used family of Langevin integrators, and show that VRORV is generally superior for simulation of molecular systems.
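For readers unfamiliar with the KL divergence, the sketch below shows its definition for two discrete distributions, such as a histogram of sampled configurations and the corresponding target histogram. It illustrates the quantity only and is not the estimator used in the paper; the function name, the direction of the divergence (sampled relative to target), and the bin values are assumptions made for illustration.

```python
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) in nats between two discrete distributions.

    p: sampled (biased) distribution; q: target distribution.
    Both are sequences of probabilities over the same bins, each summing to 1.
    """
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # 0 * log(0 / q) is taken as 0
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total

# Hypothetical histograms over four configuration-space bins.
target = [0.25, 0.25, 0.25, 0.25]
sampled = [0.30, 0.25, 0.25, 0.20]
print(kl_divergence(sampled, target))
```

The divergence is zero only when the two distributions agree in every bin, and it is positive otherwise, which is what makes it a natural scalar measure of sampling bias.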

A dynamic mechanism for allosteric activation of Aurora kinase A by activation loop phosphorylation

Emily F. Ruff, Joseph M. Muretta, Andrew Thompson, Eric W. Lake, Soreen Cyphers, Steven K. Albanese, Sonya M. Hanson, Julie M. Behr, David D. Thomas, John D. Chodera, and Nicholas M. Levinson.
eLife 7:e32766, 2018. [DOI] [bioRxiv]

We show that, contrary to the canonical belief that activation shifts DFG-out to DFG-in populations, phosphorylation of AurA does not shift DFG-in/out equilibrium but instead remodels the conformational distribution of the DFG-in state.