Figure: Effect of sequence length on behaviour of codon usage measures (see the full paper for explanation).

What are MILC and MELP?

A number of methods, also called measures, are currently used to quantify codon usage in genes. These measures are often influenced by other sequence properties, such as length, which can introduce strong methodological bias into measurements. We therefore set out to develop a method free from such dependencies.

What did we do?

We compared the performance of several commonly used measures with that of a novel method we introduce, the Measure Independent of Length and Composition (MILC). Large, randomly generated sequence sets were used to test for dependence on:

  • sequence length
  • overall amount of codon bias and
  • codon bias discrepancy in the sequences.

A derivative of the method, named MELP (MILC-based Expression Level Predictor), can be used to quantitatively predict gene expression levels from genomic data. We compared it with other similar predictors by examining how well each correlates with actual, experimentally obtained mRNA or protein abundances.
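To illustrate what comparing a predictor against measured abundances can look like, here is a minimal sketch in Python. It uses the Spearman rank correlation, which is an assumption made for this example; the full paper states which correlation statistic was actually used. The function names and the toy numbers are hypothetical and are not taken from any dataset.

```python
import math


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(predicted, measured):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(predicted), ranks(measured))


# Hypothetical toy values: one predictor score and one measured
# abundance per gene, listed in the same gene order.
predicted_scores = [0.9, 1.4, 0.7, 1.1]
measured_abundance = [120, 900, 40, 300]
print(spearman(predicted_scores, measured_abundance))
```

A rank-based statistic is convenient here because predictor scores and abundances are on different scales; only their ordering across genes is compared.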

Our conclusion...

We have established that MILC is a generally applicable measure, being resistant to changes in gene length and overall nucleotide composition, and introducing little noise into measurements. Other methods, however, may also be appropriate in certain applications.

Our efforts to quantitatively predict gene expression levels in several prokaryotes and unicellular eukaryotes met with varying levels of success, depending on the experimental dataset and predictor used. Out of all methods, MELP and Rainer Merkl's GCB method had the most consistent behaviour. A 'reference set' containing known ribosomal protein genes appears to be a valid starting point for a codon usage-based expressivity prediction.


Read the whole paper

Fran Supek, Kristian Vlahoviček. Comparison of codon usage measures and their applicability in prediction of microbial gene expressivity. BMC Bioinformatics. 2005 Jul 19;6:182.
Free full text is available in BMC Bioinformatics.


Use MILC in your work

MILC is the default codon usage measure in the INCA software package, which is freely available for academic use on Windows and Linux. Of course, you may also implement MILC and MELP in your own scripts and programs.
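As a starting point for your own implementation, here is a minimal Python sketch. It is not the INCA code, and it rests on the following assumptions, which should be checked against the full paper before use:

  • each amino acid a contributes M_a = 2 Σ O_c ln(O_c / E_c), where O_c is the observed count of codon c in the gene and E_c is the count expected from a chosen codon distribution
  • MILC = (Σ M_a) / L − C, where L is the gene length in codons and C = Σ (r_a − 1) / L − 0.5
  • r_a is the number of synonymous codons for amino acid a, summed over the amino acids present in the gene
  • MELP is the ratio of a gene's MILC against the genome average to its MILC against the reference set

The standard genetic code is assumed, and all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Standard genetic code (an assumption; other codes need another table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Number of synonymous codons per amino acid (r_a in the lead-in).
DEGENERACY = Counter(aa for aa in CODON_TO_AA.values() if aa != "*")


def count_codons(sequence):
    """Count sense codons of an in-frame coding sequence.

    Stop codons and codons with ambiguous bases are skipped.
    """
    sequence = sequence.upper()
    usable = len(sequence) - len(sequence) % 3
    codons = (sequence[i:i + 3] for i in range(0, usable, 3))
    return Counter(c for c in codons if CODON_TO_AA.get(c, "*") != "*")


def group_by_amino_acid(counts):
    """Regroup {codon: count} as {amino acid: {codon: count}}."""
    grouped = defaultdict(dict)
    for codon, n in counts.items():
        if n > 0:
            grouped[CODON_TO_AA[codon]][codon] = n
    return grouped


def milc(gene_counts, expected_counts):
    """MILC of a gene against an expected codon distribution (sketch)."""
    observed = group_by_amino_acid(gene_counts)
    expected = group_by_amino_acid(expected_counts)
    length = sum(gene_counts.values())
    if length == 0:
        raise ValueError("gene contains no sense codons")
    total = 0.0
    degrees = 0
    for aa, codons in observed.items():
        n_aa = sum(codons.values())
        family = expected.get(aa, {})
        family_total = sum(family.values())
        for codon, o in codons.items():
            if family.get(codon, 0) == 0:
                # E_c would be zero; this sketch does not smooth counts.
                raise ValueError(f"codon {codon} missing from expected set")
            e = n_aa * family[codon] / family_total
            total += 2 * o * math.log(o / e)
        degrees += DEGENERACY[aa] - 1
    correction = degrees / length - 0.5
    return total / length - correction


def melp(gene_counts, genome_counts, reference_counts):
    """MELP as a ratio of two MILC values (assumed definition)."""
    return milc(gene_counts, genome_counts) / milc(gene_counts, reference_counts)
```

A typical call would build genome_counts by summing count_codons over all genes, and reference_counts by summing it over a reference set such as the known ribosomal protein genes. Under the assumed definition, a gene whose codon usage is closer to the reference set than to the genome average gets a MELP above 1.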