In molecular simulations---especially simulations of complex systems like biomolecules---it's incredibly difficult to start the simulation close enough to equilibrium to avoid initial transients in properties of interest. As a result, it is almost universally recommended that some initial portion of the simulation be discarded to "equilibration". Unfortunately, there hasn't been a simple, automated, and generally applicable way to do this that is standard practice in the field.
In a new manuscript draft posted to bioRxiv this morning, I show how an amazingly simple approach---simply maximizing the number of statistically uncorrelated samples in the latter part of the simulation---can lead to a surprisingly robust and useful algorithm for equilibration detection. This is still a work in progress, so comments and feedback are very much appreciated!
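To make the idea concrete, here is a minimal NumPy sketch of one way to read it. It is an illustration only, not the code used for the manuscript. The function names `statistical_inefficiency` and `detect_equilibration` are hypothetical, and the autocorrelation estimator (summing the normalized autocorrelation function until it first drops to zero or below) is an assumed, deliberately simple choice. For each candidate equilibration time `t0`, the sketch estimates the statistical inefficiency `g` of the remaining data and keeps the `t0` that maximizes the effective sample count `(n - t0) / g`.

```python
import numpy as np


def statistical_inefficiency(x):
    """Estimate g = 1 + 2 * sum_t (1 - t/n) * C(t) for a 1D timeseries.

    C(t) is the normalized autocorrelation function. The sum is truncated
    at the first lag where C(t) <= 0 (an assumed, simple cutoff rule).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    dx = x - x.mean()
    var = np.dot(dx, dx) / n
    if var == 0.0:
        return 1.0  # constant data: treat every sample as independent
    g = 1.0
    for t in range(1, n - 1):
        c = np.dot(dx[:-t], dx[t:]) / ((n - t) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - t / n)
    return max(g, 1.0)


def detect_equilibration(x, stride=1):
    """Return (t0, g, n_eff) maximizing n_eff = (n - t0) / g over t0.

    Everything before index t0 is treated as equilibration and discarded.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    best = (0, 1.0, 0.0)
    for t0 in range(0, n - 1, stride):
        g = statistical_inefficiency(x[t0:])
        n_eff = (n - t0) / g
        if n_eff > best[2]:
            best = (t0, g, n_eff)
    return best


# Synthetic example: a decaying initial transient plus white noise.
rng = np.random.default_rng(0)
time = np.arange(400)
series = 10.0 * np.exp(-time / 20.0) + rng.normal(size=time.size)
t0, g, n_eff = detect_equilibration(series)
print(f"discard first {t0} samples; g = {g:.2f}; N_eff = {n_eff:.1f}")
```

This brute-force scan recomputes the autocorrelation for every candidate `t0`, so for long trajectories a larger `stride` keeps the cost manageable.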
All code needed to grab the exact versions of the tools I used (using the conda package installer and the omnia molecular simulation suite), generate the simulation data, analyze it, and generate the figures for the paper is available on GitHub: You simply need to run
to regenerate everything---which is exactly what I did to generate the figures in the posted version of the manuscript. There are still a few improvements I hope to make so that the scripts are easier to read and the data easier to work with, but hopefully we can attain this level of ultra-simple reproducibility in future work as well.
Update [5 July 2015]: The manuscript has been updated based on valuable feedback I've already received! Thanks to everyone who has made comments!