End-to-end differentiable molecular mechanics force field construction

Yuanqing Wang, Josh Fass, and John D. Chodera
Chemical Science 13:12016, 2022 [DOI] [arXiv] [pytorch code] [JAX code]

Molecular mechanics force fields have been a workhorse for computational chemistry and drug discovery. Here, we propose a new approach to force field parameterization in which graph convolutional networks are used to perceive chemical environments and assign molecular mechanics (MM) force field parameters. The entire process of chemical perception and parameter assignment is differentiable end-to-end with respect to model parameters, allowing new force fields to be easily constructed from MM or QM force fields, extended, and applied to arbitrary biomolecules.

Best practices for constructing, preparing, and evaluating protein-ligand binding affinity benchmarks

David F Hahn, Christopher I Bayly, Hannah E Bruce Macdonald, John D Chodera, Antonia SJS Mey, David L Mobley, Laura Perez Benito, Christina EM Schindler, Gary Tresadern, Gregory L Warren
Preprint ahead of publication: [arXiv] [GitHub]

This living best practices paper for the Living Journal of Computational Molecular Sciences describes the current community consensus on how to curate experimental benchmark data for assessing predictive affinity models for drug discovery, how to prepare these systems for affinity calculations, and how to assess the results to compare performance.

Bayesian inference-driven model parameterization and model selection for 2CLJQ fluid models

Owen C Madin, Simon Boothroyd, Richard A Messerly, John D Chodera, Josh Fass, and Michael R Shirts
Preprint ahead of publication: [arXiv]

Here, we show how Bayesian inference can be used to automatically perform model selection and fit parameters for a molecular mechanics force field.

Machine learning force fields and coarse-grained variables in molecular dynamics: application to materials and biological systems

Gkeka P, Stoltz G, Farimani AB, Belkacemi Z, Ceriotti M, Chodera JD, Dinner AR, Ferguson A, Maillet JB, Minoux H, Peter C, Pietrucci F, Silveira A, Tkatchenko A, Trstanova Z, Wiewiora R, Lelièvre T.
Journal of Chemical Theory and Computation 60:6211, 2020. [DOI] [arXiv]

We review the state of the art in applying machine learning to coarse-grain force fields in space and time to study multiscale dynamics.

Graph nets for partial charge prediction

Yuanqing Wang, Josh Fass, Chaya D. Stern, Kun Luo, and John D. Chodera.
Preprint ahead of publication: [arXiv] [GitHub]

Graph convolutional and message-passing networks can be a powerful tool for predicting physical properties of small molecules when coupled to a simple physical model that encodes the relevant invariances. Here, we show the ability of graph nets to predict partial atomic charges for use in molecular dynamics simulations and physical docking.

Towards Automated Benchmarking of Atomistic Forcefields: Neat Liquid Densities and Static Dielectric Constants from the ThermoML Data Archive

Kyle A. Beauchamp, Julie M. Behr, Ariën S. Rustenburg, Christopher I. Bayly, Kenneth Kroenlein, and John D. Chodera.
J. Phys. Chem. B 119:12912, 2015. [DOI] [PDF] // code: [GitHub] // preprint: [arXiv]

Progress in forcefield validation and parameterization has been hindered by the limited availability of high-quality machine-readable physical property data for small organic molecules. We show how the NIST ThermoML dataset provides a solution to this problem, and demonstrate its utility in benchmarking the GAFF/AM1-BCC small molecule forcefield on neat liquid densities and static dielectric constants to uncover problems in the representation of low-dielectric environments.

A robust approach to estimating rates from time-correlation functions

John D. Chodera, Phillip J. Elms, William C. Swope, Jan-Hendrik Prinz, Susan Marqusee, Carlos Bustamante, Frank Noé, and Vijay S. Pande
Preprint ahead of submission: [arXiv] [PDF] [SI]

The estimation of rates from experimental single-molecule data is fraught with peril. We describe some of the failures of existing methods and suggest a robust way to estimate rates from time-correlation functions.