Modelling the Evolution of Protein Coding Sequences Sampled from Measurably Evolving Populations

Matthew Goode [1]
Stéphane Guindon [1,2]
Allen Rodrigo [1]

[1] The Bioinformatics Institute New Zealand and the Allan Wilson Centre for Molecular Ecology and Evolution, University of Auckland, Private Bag 92019, Auckland, New Zealand
[2] Department of Statistics, University of Auckland, Private Bag 92019, Auckland, New Zealand


Models of nucleotide or amino acid sequence evolution that implement homogeneous and stationary Markov processes of substitution are mathematically convenient but are unlikely to represent the true complexity of evolution. With the large amounts of data that next-generation sequencing promises, appropriate models of evolution are important, particularly when data are collected from ancient and sub-fossil remains, where changes in evolutionary parameters are the norm rather than the exception. In this paper, we describe a new codon-based model of evolution that applies to Measurably Evolving Populations (MEPs). An MEP is defined as a population from which it is possible to detect a statistically significant accumulation of substitutions when sequences are obtained at different times. The new model of codon evolution permits changes to the substitution process, including changes to the intensity of selection and to the proportions of sites undergoing different selective pressures. In our serial model of codon evolution, changes in the selective regime occur simultaneously across all lineages. Different regions of the protein may also evolve under distinct selective patterns. We illustrate the application of the new model to a dataset of HIV-1 sequences obtained from an infected individual before and after the commencement of antiretroviral therapy.
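
To make the idea of a regime change that applies simultaneously across all lineages concrete, the following is a minimal sketch and not the authors' implementation. It assumes time runs forward from an arbitrary origin, that the selective regime is summarised by a single nonsynonymous/synonymous rate ratio (omega) per epoch, and that regimes switch at known change times. The function name `split_branch_by_epoch` and all parameter names are hypothetical. Each branch of a tree is cut at the shared change times, so every lineage crossing a given time experiences the same switch.

```python
from bisect import bisect_right


def split_branch_by_epoch(start, end, change_times, omegas):
    """Split a branch spanning [start, end] into per-epoch segments.

    change_times: sorted times at which the selective regime switches
                  (assumed shared by every lineage in the tree).
    omegas:       one omega value per epoch, len(change_times) + 1 in total.
    Returns a list of (duration, omega) pairs in forward time order.
    """
    if end < start:
        raise ValueError("end must not precede start")
    if len(omegas) != len(change_times) + 1:
        raise ValueError("need exactly one omega per epoch")

    segments = []
    t = start
    # Index of the epoch that contains the start of the branch.
    i = bisect_right(change_times, start)
    while t < end:
        # A segment ends at the next regime change or at the branch end,
        # whichever comes first.
        next_change = change_times[i] if i < len(change_times) else end
        seg_end = min(next_change, end)
        segments.append((seg_end - t, omegas[i]))
        t = seg_end
        i += 1
    return segments


# Hypothetical example: one regime change at time 5.0 (for instance, the
# start of therapy), with omega 0.2 before and 1.5 after.
print(split_branch_by_epoch(3.0, 8.0, [5.0], [0.2, 1.5]))
```

For a process that is time-homogeneous within each epoch, the transition probabilities along the whole branch would then be the product of the per-segment transition matrices, each computed with that segment's duration and omega.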


Japanese Society for Bioinformatics