
Revision as of 07:39, 16 July 2012
Our article on a probabilistic model of the C-alpha geometry of proteins made the cover of the September 2006 issue of PLoS Computational Biology

PHAISTOS is a Markov chain Monte Carlo framework for protein structure simulations. It contains a variety of both established and novel move types, and supports several force fields from the literature. In addition, an interface to the Muninn generalized ensemble package makes it easy to conduct multi-histogram based simulations, avoiding the convergence problems often associated with Metropolis-Hastings based sampling.

A greedy collapse using the TorusDBN and radius of gyration as an energy function.

A unique feature of PHAISTOS is the use of probabilistic models to capture essential structural properties of proteins. These models are available both as proposal distributions (moves) and for likelihood evaluations (energies). This increases the flexibility when setting up a simulation, because the user can choose how to incorporate the bias provided by these models. For instance, as with fragment or rotamer libraries, using probabilistic models to sample backbone and sidechain angles corresponds to having an implicit energy term present in the simulation. Unlike with fragment and rotamer libraries, however, this term can be evaluated and compensated for if necessary. PHAISTOS currently incorporates models for the CA-only representation of protein backbones (FB5HMM), full-atom backbones (TORUSDBN), full-atom sidechains (BASILISK), and single-mass sidechains (COMPAS).
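The idea of an implicit energy term that can be compensated for can be illustrated with a toy Metropolis-Hastings sampler. The following is a minimal sketch and not PHAISTOS code: the function names `acceptance` and `sample`, the three-state system, and the proposal probabilities are all assumptions made for illustration. States are proposed independently from a model q. If the acceptance step uses the energy alone, the chain samples q(x)·exp(-E(x)), so -log q acts as an implicit energy. If the proposal ratio is included, the bias is divided out and the chain samples exp(-E(x)) alone.

```python
import math
import random


def acceptance(e_old, e_new, log_q_old, log_q_new, compensate_bias):
    """Acceptance probability for a move proposed independently from a model q.

    e_old, e_new         : energies (in units of kT) of current and proposed state
    log_q_old, log_q_new : log-probability of each state under the proposal model
    compensate_bias      : if True, include the proposal ratio (Hastings correction)
    """
    log_ratio = -(e_new - e_old)
    if compensate_bias:
        # Divide out the proposal bias so that only exp(-E) is sampled.
        log_ratio += log_q_old - log_q_new
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def sample(q, energy, n_steps, compensate_bias, seed=0):
    """Run the toy chain and return the visited frequency of each state."""
    rng = random.Random(seed)
    states = list(range(len(q)))
    x = 0
    counts = [0] * len(q)
    for _ in range(n_steps):
        # Propose a new state from the (hypothetical) probabilistic model q.
        x_new = rng.choices(states, weights=q)[0]
        a = acceptance(energy[x], energy[x_new],
                       math.log(q[x]), math.log(q[x_new]), compensate_bias)
        if rng.random() < a:
            x = x_new
        counts[x] += 1
    return [c / n_steps for c in counts]


if __name__ == "__main__":
    q = [0.7, 0.2, 0.1]        # illustrative proposal model
    flat = [0.0, 0.0, 0.0]     # flat explicit energy
    print("bias kept:       ", sample(q, flat, 50000, compensate_bias=False))
    print("bias compensated:", sample(q, flat, 50000, compensate_bias=True))
```

With a flat explicit energy, the first run should reproduce the proposal model itself, while the second should visit all three states about equally often.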

PHAISTOS also contains a highly efficient local move, CRISP, which resamples short stretches of the protein backbone without violating the local geometry of the chain. This move was recently demonstrated to outperform current state-of-the-art local move algorithms. In addition, it was shown that this move makes it possible to explore native ensembles of proteins with an efficiency similar to that of Molecular Dynamics.

Finally, PHAISTOS contains tools to conduct simulations under restraints from experimental data. In the current release, we have support for SAXS data and NMR chemical shift data, but this will be extended to other data types in future releases.

Related Projects

  • Muninn, a framework for conducting generalized ensemble simulations.
  • Mocapy++, a C++ toolkit for inference and learning in dynamic Bayesian networks that supports directional statistics. Directional statistics is the statistics of angles and directions, which is especially useful for formulating probabilistic models of biomolecular structure. We used this toolkit to formulate and train our probabilistic models of protein structure.
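
A short example shows why angles need their own statistics. This is a generic sketch and does not use the Mocapy++ API; the helper name `circular_mean` is an assumption made for illustration. The arithmetic mean of 350° and 10° is 180°, which points in the opposite direction to both angles, whereas the circular mean, computed from the summed sines and cosines, is close to 0°.

```python
import math


def circular_mean(angles):
    """Mean direction of a list of angles in radians, in the range [-pi, pi]."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)


if __name__ == "__main__":
    angles = [math.radians(350.0), math.radians(10.0)]
    naive = sum(angles) / len(angles)
    print("arithmetic mean (deg):", math.degrees(naive))
    print("circular mean (deg):  ", math.degrees(circular_mean(angles)))
```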

PHAISTOS-related references

  • Hamelryck, T., Kent, J., Krogh, A. (2006) Sampling realistic protein conformations using local structural bias. PLoS Comput. Biol., 2(9): e131. Download pdf.
  • Boomsma, W., Mardia, KV., Taylor, CC., Ferkinghoff-Borg, J., Krogh, A. and Hamelryck, T. (2008) A generative, probabilistic model of local protein structure. Proc. Natl. Acad. Sci. USA, 105, 8932-8937. Download pdf.
  • Borg, M., Mardia, KV., Boomsma, W., Frellsen, J., Harder, T., Stovgaard, K., Ferkinghoff-Borg, J., Røgen, P., Hamelryck, T. A probabilistic approach to protein structure prediction: PHAISTOS in CASP9. LASR 2009 - Statistical tools for challenges in bioinformatics, pp. 65-70. Leeds University Press, Leeds, UK. Download pdf.
  • Harder, T., Boomsma, W., Paluszewski, M., Frellsen, J., Johansson, KE., Hamelryck, T. (2010) Beyond rotamers: A generative, probabilistic model of side chains in proteins. BMC Bioinformatics, 11:306. Download pdf.
  • Stovgaard, K., Andreetta, C., Ferkinghoff-Borg, J., Hamelryck, T. (2010) Calculation of accurate small angle X-ray scattering curves from coarse-grained protein models. BMC Bioinformatics, 11:429. Download pdf.
  • Hamelryck, T., Borg, M., Paluszewski, M., Paulsen, J., Frellsen, J., Andreetta, C., Boomsma, W., Bottaro, S., Ferkinghoff-Borg, J. (2010) Potentials of mean force for protein structure prediction vindicated, formalized and generalized. PLoS ONE, 5(11): e13714. Download pdf.
  • Olsson, S., Boomsma, W., Frellsen, J., Bottaro, S., Harder, T., Ferkinghoff-Borg, J., Hamelryck, T. (2011) Generative probabilistic models extend the scope of inferential structure determination. J. Magn. Reson. 213(1), 182-6. Pubmed.
  • Harder, T., Borg, M., Boomsma, W., Røgen, P., Hamelryck, T. (2011) Fast large-scale clustering of protein structures using Gauss integrals. Bioinformatics. Download pdf.
  • Bottaro, S., Boomsma, W., Johansson, K.E., Andreetta, C., Hamelryck, T. and Ferkinghoff-Borg, J. (2012) Subtle Monte Carlo updates in dense molecular systems. Journal of Chemical Theory and Computation, 8(2), 695-702. HTML at journal.
  • Harder, T., Borg, M., Bottaro, S., Boomsma, W., Olsson, S., Ferkinghoff-Borg, J., Hamelryck, T. (2012) An efficient null model for conformational fluctuations in proteins. Structure, 20, 1028-1039. Download pdf.
  • Boomsma, W., Frellsen, J., Harder, T., Bottaro, S., Johansson, K.E., Tian, P., Stovgaard, K., Andreetta, C., Christensen, A.S., Olsson, S., Valentin, J., Borg, M., Ferkinghoff-Borg, J., Hamelryck, T. PHAISTOS: A Framework for Markov Chain Monte Carlo Simulation of Proteins. Submitted.


The development of PHAISTOS was made possible through grants from the Danish Council for Independent Research, the Danish Council for Strategic Research, the Novo Nordisk STAR Program, and Radiometer (DTU).