3D structure prediction
One of the major unsolved problems in modern molecular biology is the protein folding problem: given an amino acid sequence, predict the overall three-dimensional structure of the corresponding protein. It has been known since the seminal work of Christian B. Anfinsen in the early 1970s that the sequence of a protein encodes its structure, but the exact details of the encoding remain elusive. Because the protein folding problem is of enormous practical, theoretical and medical importance, and also poses a fascinating intellectual challenge, it is often called the holy grail of bioinformatics.
We are tackling the protein structure prediction problem from an original angle. Our group develops sophisticated probabilistic models that describe various aspects of protein structure, and uses these models in the prediction of structure from sequence. These probabilistic models are mainly based on two key ingredients: graphical models (including dynamic Bayesian networks and factor graphs), which are powerful machine learning methods, and directional statistics, the statistics of angles, directions and orientations.
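As a small illustration of why angular data needs directional statistics, consider averaging dihedral angles that cluster around the ±180° boundary. The sketch below is not taken from our software; the function name circular_mean and the angle values are hypothetical, chosen only to show the effect. It compares the ordinary arithmetic mean with the circular mean, which averages the angles as unit vectors.

```python
import math

def circular_mean(angles_deg):
    """Mean direction of angles given in degrees, returned in [-180, 180]."""
    # Represent each angle as a unit vector and sum the components.
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    # The direction of the summed vector is the circular mean.
    return math.degrees(math.atan2(s, c))

# Hypothetical dihedral angles clustered around the +/-180 degree boundary.
angles = [170.0, 175.0, -175.0, -170.0]

arithmetic = sum(angles) / len(angles)   # treats angles as points on a line
circular = circular_mean(angles)         # treats angles as points on a circle

print("arithmetic mean:", arithmetic)
print("circular mean:  ", circular)
```

The arithmetic mean lands at 0°, on the opposite side of the circle from every observation, while the circular mean has a magnitude close to 180°, where the data actually lie. Distributions on the circle, such as the von Mises distribution, are built around this kind of geometry.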
It is important to note that these probabilistic models are not black-box methods: they can be rigorously interpreted and used within the framework of physics, and more specifically statistical mechanics. In other words, we follow the view of Edwin T. Jaynes, who showed that statistical mechanics can be seen as a form of statistical inference based on partial information, rather than a physical theory.
We are steadily working on a program for ab initio protein structure prediction based on probabilistic models, called Phaistos. Some of the software is already freely available; see below.