Towards chemical accuracy for alchemical free energy calculations with hybrid physics-based machine learning / molecular mechanics potentials

Dominic A. Rufa, Hannah E. Bruce Macdonald, Josh Fass, Marcus Wieder, Patrick B. Grinaway, Adrian E. Roitberg, Olexandr Isayev, and John D. Chodera.
Preprint ahead of submission.
[bioRxiv] [GitHub]

In this first use of hybrid machine learning / molecular mechanics (ML/MM) potentials for alchemical free energy calculations, we demonstrate how the improved modeling of intramolecular ligand energetics offered by the quantum machine learning potential ANI-2x can significantly improve the accuracy of predicted kinase inhibitor binding free energies, reducing the error from 0.97 kcal/mol to 0.47 kcal/mol. This could drastically reduce the number of compounds that must be synthesized in lead optimization campaigns, at minimal additional computational cost.

Is structure based drug design ready for selectivity optimization?

Steven K. Albanese, John D. Chodera, Andrea Volkamer, Simon Keng, Robert Abel, and Lingle Wang
Journal of Chemical Information and Modeling [DOI] [bioRxiv] [GitHub]

We asked whether the similarity of binding sites in related kinases might result in a fortuitous cancellation of errors in using alchemical free energy calculations to predict kinase inhibitor selectivities. Surprisingly, we find that even distantly related kinases have sufficient correlation in their errors that predicting changes in selectivity can be much more accurate than predicting changes in potency due to this effect, and show how this could lead to large reductions in the number of molecules that must be synthesized to achieve a desired selectivity goal.
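
The cancellation argument above can be illustrated with a short calculation. This is a minimal sketch under stated assumptions, not the analysis used in the paper: it treats the prediction errors for the two kinases as random variables with standard deviations `sigma_a` and `sigma_b` and correlation `rho` (all hypothetical names), and uses the standard formula for the variance of a difference.

```python
import math


def selectivity_error_std(sigma_a, sigma_b, rho):
    """Standard deviation of the error in a difference of two predictions.

    sigma_a, sigma_b: standard deviations of the two individual errors.
    rho: correlation coefficient between the two errors (-1 <= rho <= 1).
    """
    variance = sigma_a**2 + sigma_b**2 - 2.0 * rho * sigma_a * sigma_b
    # Guard against tiny negative values caused by floating-point round-off.
    return math.sqrt(max(variance, 0.0))


if __name__ == "__main__":
    # Illustrative values only: equal per-target errors of 1 kcal/mol.
    for rho in (0.0, 0.5, 0.9):
        print(rho, selectivity_error_std(1.0, 1.0, rho))
```

With uncorrelated errors the difference is noisier than either prediction alone; as the correlation approaches 1, the error in the difference shrinks toward zero.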

Machine learning force fields and coarse-grained variables in molecular dynamics: application to materials and biological systems

Gkeka P, Stoltz G, Farimani AB, Belkacemi Z, Ceriotti M, Chodera JD, Dinner AR, Ferguson A, Maillet JB, Minoux H, Peter C, Pietrucci F, Silveira A, Tkatchenko A, Trstanova Z, Wiewiora R, Lelièvre T.
Journal of Chemical Theory and Computation 60:6211, 2020. [DOI] [arXiv]

We review the state of the art in applying machine learning to coarse-grain force fields in space and time to study multiscale dynamics.

Octanol-water partition coefficient measurements for the SAMPL6 Blind Prediction Challenge

Mehtap Işık, Dorothy Levorse, David L. Mobley, Timothy Rhodes, and John D. Chodera.
Journal of Computer-Aided Molecular Design 34:405, 2020. [DOI] [bioRxiv] [data] [GitHub]

We describe the design and data collection (and associated challenges) for the SAMPL6 part II logP octanol-water blind prediction challenge, where the goal was to benchmark the accuracy of force fields for druglike molecules (here, molecules resembling kinase inhibitors).

A small-molecule pan-Id antagonist inhibits pathologic ocular neovascularization

Paulina M. Wojnarowicz, Raquel Lima e Silva, Masayuki Ohanka, Sang Bae Lee, Yvette Chin, Anita Kulukian, Sung-Hee Chang, Bina Desai, Marta Garcia Escolano, Riddhi Shah, Marta Garcia-Cao, Sijia Xu, Rashmi Thakar, Yehuda Goldgur, Meredith A. Miller, Ouathek Ouerfelli, Guangli Yang, Tsutomu Arakawa, Steven K. Albanese, William A. Garland, Glenn Stoller, Jaideep Chaudhary, Rajesh Soni, John Philip, Ronald C. Hendrickson, Antonio Iavarone, Andrew J. Dannenberg, John D. Chodera, Nikola Pavletich, Anna Lasorella, Peter A. Campochiaro, and Robert Benezra
Cell Reports 29:62, 2019 [DOI] [PDF]

We report the discovery and characterization of a small molecule, AGX51, with the surprising ability to inhibit the interaction of Id1 with E47, which leads to ubiquitin-mediated degradation of Ids.

Graph nets for partial charge prediction

Yuanqing Wang, Josh Fass, Chaya D. Stern, Kun Luo, and John D. Chodera.
Preprint ahead of publication.
[arXiv] [GitHub]

Graph convolutional and message-passing networks can be a powerful tool for predicting physical properties of small molecules when coupled to a simple physical model that encodes the relevant invariances. Here, we show the ability of graph nets to predict partial atomic charges for use in molecular dynamics simulations and physical docking.

Sharing data from molecular simulations

Abraham MJ, Apostolov R, Barnoud J, Bauer P, Blau C, Bonvin AMJJ, Chavent M, Chodera JD, Condic-Jurkic K, Delemotte L, Grubmüller H, Howard RJ, Lindahl E, Ollila S, Salent J, Smith D, Stansfeld PJ, Tiemann J, Trellet M, Woods C, and Zhmurov A.
Journal of Chemical Information and Modeling ASAP. [chemRxiv] [DOI] [PDF]

There is a dire need to establish standards for sharing data in the molecular sciences. Here, we review the findings of a workshop held in Stockholm in November 2018 to discuss this need.

Binding thermodynamics of host-guest systems with SMIRNOFF99Frosst 1.0.5 from the Open Force Field Initiative

David R. Slochower, Neil M. Hendriksen, Lee-Ping Wang, John D. Chodera, David L. Mobley, and Michael K. Gilson.
Journal of Chemical Theory and Computation ASAP. [DOI] [bioRxiv] [GitHub]

We assess the accuracy of the SMIRNOFF99Frosst 1.0.5 force field in reproducing host-guest binding thermodynamics in comparison with the GAFF force field, demonstrating how the SMIRNOFF format for compactly specifying force fields provides comparable accuracy with 20x fewer parameters.

OpenPathSampling: A Python framework for path sampling simulations. II. Building and customizing path ensembles and sample schemes

David W.H. Swenson, Jan-Hendrik Prinz, Frank Noé, John D. Chodera, Peter G. Bolhuis
Journal of Chemical Theory and Computation 15:837, 2019. [DOI] [bioRxiv] [PDF] [GitHub] [openpathsampling.org]

To make powerful path sampling techniques broadly accessible and efficient, we have produced a new Python framework for easily implementing path sampling strategies (such as transition path and interface sampling). This second publication describes advanced aspects of the theory and details of how to customize path ensembles.

OpenPathSampling: A Python framework for path sampling simulations. I. Basics

David W.H. Swenson, Jan-Hendrik Prinz, Frank Noé, John D. Chodera, Peter G. Bolhuis
Journal of Chemical Theory and Computation 15:813, 2019 [DOI] [bioRxiv] [PDF] [GitHub] [openpathsampling.org]

To make powerful path sampling techniques broadly accessible and efficient, we have produced a new Python framework for easily implementing path sampling strategies (such as transition path and interface sampling). This first publication describes some of the theory and capabilities behind the approach.

Toward learned chemical perception of force field typing rules

Camila Zanette, Caitlin C. Bannan, Christopher I. Bayly, Josh Fass, Michael K. Gilson, Michael R. Shirts, John Chodera, and David L. Mobley
Journal of Chemical Theory and Computation 15:402, 2019. [DOI] [ChemRxiv] [GitHub]

We show how machine learning can learn typing rules for molecular mechanics force fields within a Bayesian statistical framework.

Overview of the SAMPL6 host-guest binding affinity prediction challenge

Andrea Rizzi, Steven Murkli, John N. McNeill, Wei Yao, Matthew Sullivan, Michael K. Gilson, Michael W. Chiu, Lyle Isaacs, Bruce C. Gibb, David L. Mobley*, John D. Chodera*
* denotes co-corresponding authors
Journal of Computer-Aided Molecular Design special issue on SAMPL6, 32:937, 2018. [DOI] [bioRxiv] [GitHub]

We present an overview of the host-guest systems and participant performance for the SAMPL6 host-guest blind affinity prediction challenges, assessing how well various physical modeling approaches were able to predict ligand binding affinities for simple ligand recognition problems where receptor sampling and protonation state effects are eliminated due to the simplicity of supramolecular hosts. We find that progress has now stagnated, likely due to force field limitations.

An open library of human kinase domain constructs for automated bacterial expression

Steven K. Albanese*, Daniel L. Parton*, Mehtap Işık**, Lucelenie Rodríguez-Laureano**, Sonya M. Hanson, Julie M. Behr, Scott Gradia, Chris Jeans, Nicholas M. Levinson, Markus A. Seeliger, and John D. Chodera.
* co-first author; ** co-second author
Biochemistry 57:4675, 2018. [DOI] [PDF] [bioRxiv] [GitHub]
Interactive data browser: [github.io]
Plasmids available via AddGene

Human kinase catalytic domains (the therapeutic target of selective kinase inhibitors used in the treatment of cancer and other diseases) are notoriously difficult and expensive to express in insect or human cells. Here, we utilize the phosphatase co-expression technology developed by Markus Seeliger (now at Stony Brook) to develop a library of human kinase catalytic domains for facile and inexpensive expression in bacteria.

pKa measurements for the SAMPL6 prediction challenge for a set of kinase inhibitor-like fragments

Mehtap Işık, Dorothy Levorse, Ariën S. Rustenburg, Ikenna E. Ndukwe, Heather Wang, Xiao Wang, Mikhail Reibarkh, Gary E. Martin, Alexey A. Makarov, David L. Mobley, Timothy Rhodes*, John D. Chodera*.
* co-corresponding authors
Journal of Computer-Aided Molecular Design special issue on SAMPL6 32:1117, 2018.
[DOI] [PDF] [bioRxiv] [Supplementary Tables and Figures] [Supplementary Data (includes Sirius T3 reports on all measurements)]

The SAMPL5 blind challenge exercises identified neglect of protonation state effects as a major accuracy-limiting factor in physical modeling of biomolecular interactions. In this study, we report the experimental measurements behind a SAMPL6 blind challenge in which we assess the ability of community codes to predict pKas for small molecules resembling fragments of selective kinase inhibitors.

Predicting resistance of clinical Abl mutations to targeted kinase inhibitors using alchemical free-energy calculations

Kevin Hauser, Christopher Negron, Steven K. Albanese, Soumya Ray, Thomas Steinbrecher, Robert Abel, John D. Chodera, and Lingle Wang.
Communications Biology 1:70, 2018 [DOI] [PDF] [input files and analysis scripts]

In our first collaborative paper with Schrödinger, we present the first comprehensive benchmark assessing the ability of alchemical free energy calculations to predict clinical mutational resistance or susceptibility to targeted kinase inhibitors using the well-studied kinase Abl, the target of therapy for chronic myelogenous leukemia (CML).

Bayesian analysis of isothermal titration calorimetry for binding thermodynamics

Trung Hai Nguyen, Ariën S. Rustenburg, Stefan G. Krimmer, Hexi Zhang, John D. Clark, Paul A. Novick, Kim Branson, Vijay S. Pande, John D. Chodera, and David D. L. Minh.
PLoS One 13:e0203224, 2018. [DOI] [bioRxiv] [GitHub]

We show how Bayesian inference can produce greatly improved estimates of statistical uncertainty from isothermal titration calorimetry (ITC) experiments, allowing the joint distribution of thermodynamic parameter uncertainties to be inferred.

Quantifying configuration-sampling error in Langevin simulations of complex molecular systems

Josh Fass, David Sivak, Gavin E. Crooks, Kyle A. Beauchamp, Benedict Leimkuhler, and John Chodera.
Entropy 20:318, 2018. [DOI] [PDF] [GitHub] [bioRxiv preprint]

Molecular dynamics simulations necessarily use a finite timestep, which introduces error or bias in the sampled configuration space density that grows rapidly with increasing timestep. For the first time, we show how to compute a natural measure of this error (the KL divergence) in both phase and configuration space for a widely used family of Langevin integrators, and show that VRORV is generally superior for simulation of molecular systems.
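
For readers unfamiliar with the KL divergence mentioned above, the following minimal sketch computes it for two discrete (histogram) distributions. It is an illustration of the quantity only, not the estimator developed in the paper; the function name and the choice of which distribution is treated as "sampled" (`p`) versus "target" (`q`) are assumptions made for this example.

```python
import math


def kl_divergence(p, q):
    """Discrete KL divergence D(p || q) = sum_i p_i * ln(p_i / q_i), in nats.

    p, q: sequences of probabilities over the same bins, each summing to 1.
    Bins with p_i == 0 contribute nothing; a bin with p_i > 0 and q_i == 0
    makes the divergence infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i > 0.0:
            if q_i <= 0.0:
                return math.inf
            total += p_i * math.log(p_i / q_i)
    return total


if __name__ == "__main__":
    # Hypothetical histograms: a slightly biased sample versus a uniform target.
    sampled = [0.30, 0.25, 0.25, 0.20]
    target = [0.25, 0.25, 0.25, 0.25]
    print(kl_divergence(sampled, target))
```

The divergence is zero only when the two distributions agree bin by bin, which is why it serves as a natural measure of how far a sampled density has drifted from the intended one.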

Escaping atom types in force fields using direct chemical perception

David L. Mobley, Caitlin C. Bannan, Andrea Rizzi, Christopher I. Bayly, John D. Chodera, Victoria T. Lim, Nathan M. Lim, Kyle A. Beauchamp, Michael R. Shirts, Michael K. Gilson, Peter K. Eastman.
Journal of Chemical Theory and Computation 14:6076, 2018 [DOI] [bioRxiv]

We describe the philosophy behind a modern approach to molecular mechanics force field parameterization, and present initial results for the first SMIRNOFF-encoded force field: SMIRNOFF99Frosst.

A dynamic mechanism for allosteric activation of Aurora kinase A by activation loop phosphorylation

Emily F. Ruff, Joseph M. Muretta, Andrew Thompson, Eric W. Lake, Soreen Cyphers, Steven K. Albanese, Sonya M. Hanson, Julie M. Behr, David D. Thomas, John D. Chodera, and Nicholas M. Levinson.
eLife 7:e32766, 2018. [DOI] [bioRxiv]

We show that, contrary to the canonical belief that activation shifts DFG-out to DFG-in populations, phosphorylation of AurA does not shift DFG-in/out equilibrium but instead remodels the conformational distribution of the DFG-in state.

Biomolecular simulations under realistic macroscopic salt conditions

Gregory A. Ross, Ariën S. Rustenburg, Patrick B. Grinaway, Josh Fass, and John D. Chodera
Journal of Physical Chemistry B 122:5466, 2018. [DOI] [bioRxiv] [simulation code] [results and analysis scripts]

We show how nonequilibrium candidate Monte Carlo (NCMC) can be used to implement an efficient osmostat in molecular dynamics simulations to model realistic fluctuations in ion environments around biomolecules, and illustrate how the local salt environment around biological macromolecules can differ substantially from bulk.