Critical Assessment of Functional Annotation

The Critical Assessment of Functional Annotation (CAFA) is an experiment designed to provide a large-scale assessment of computational methods dedicated to predicting protein function.^[1] Different algorithms are evaluated by their ability to predict the Gene Ontology (GO) terms in the categories of Molecular Function, Biological Process, and Cellular Component.

The experiment consists of two tracks: (i) the eukaryotic track and (ii) the prokaryotic track. In each track, the organizers provide a set of target proteins. Participants are expected to submit their predictions by the submission deadline, after which the predictions are assessed according to a set of specific metrics.
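As an illustration of how a submitted prediction might be scored, the following minimal sketch compares a set of predicted GO terms with a set of experimentally determined ones. The article does not name the metrics used, so precision and recall are an assumed, illustrative choice here; the function names, the confidence scores, the 0.5 threshold, and the sample annotations are all hypothetical.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for two sets of GO term identifiers."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


def terms_above(scores, threshold):
    """Keep the GO terms whose confidence score reaches the threshold."""
    return {term for term, score in scores.items() if score >= threshold}


# Hypothetical scored predictions for one target protein (assumed format).
scores = {"GO:0003677": 0.9, "GO:0005634": 0.6, "GO:0006355": 0.2}
# Hypothetical experimentally determined annotations for the same protein.
actual = {"GO:0003677", "GO:0006355"}

predicted = terms_above(scores, 0.5)
precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Raising the threshold usually trades recall for precision, which is why a predictor that outputs scores can be examined at several thresholds rather than at a single one.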

Motivation

The genome of an organism may consist of hundreds to tens of thousands of genes, which encode hundreds of thousands of different protein sequences. Because genome sequencing is relatively cheap, determining gene and protein sequences is fast and inexpensive, and thousands of species have been sequenced so far.^[2] Determining what a protein does in a cell, on the other hand, is time-consuming and expensive. Even when functional assays are performed, they are unlikely to provide complete insight into protein function. It has therefore become important to use computational tools to annotate proteins functionally. In short, given proteins with known function, computational function predictors need to infer the functions of all the remaining proteins.

The CAFA experiment is designed to provide an unbiased assessment of computational methods, to stimulate research in computational function prediction, and to provide insights into the overall state of the art in function prediction.

Organization

The experiment consists of three phases:

Prediction phase: ~4 months
Organizers provide protein sequences with unknown or incomplete function to the community and set the deadline for the submission of predictions.
Target accumulation phase: 6-12 months
After all predictions are stored, the experiment enters a waiting period in which protein functions are expected to accumulate in public databases.
Analysis phase: 1 month
Predictors are ranked according to their performance.
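The role of the waiting period can be sketched in code: targets that gain annotations between the submission deadline and the end of target accumulation become the proteins on which predictions can be checked. This is a minimal sketch of that reading of the three phases, not the organizers' actual procedure; the function name, the dictionary-based snapshots, and the sample identifiers are hypothetical.

```python
def select_benchmark(targets, before, after):
    """Return targets that gained annotations during the waiting period.

    targets: iterable of protein identifiers released to predictors
    before:  dict protein -> set of GO terms known at the submission deadline
    after:   dict protein -> set of GO terms known when the waiting period ends
    """
    benchmark = {}
    for protein in targets:
        gained = after.get(protein, set()) - before.get(protein, set())
        if gained:  # only proteins with new annotations can be evaluated
            benchmark[protein] = gained
    return benchmark


# Hypothetical annotation snapshots for three released targets.
targets = ["P1", "P2", "P3"]
before = {"P1": set(), "P2": {"GO:0003677"}}
after = {"P1": {"GO:0005634"}, "P2": {"GO:0003677"}, "P3": set()}

benchmark = select_benchmark(targets, before, after)
print(benchmark)
```

In this toy input only P1 gains a new term, so it is the only protein that ends up in the benchmark; P2 and P3 are left out because nothing new was learned about them.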

The results are publicly shared in scientific meetings and published after peer review.

History

The CAFA experiment is conducted by the Automated Function Prediction (AFP) Special Interest Group (AFP/SIG). AFP/SIG meetings were held alongside the Intelligent Systems for Molecular Biology conference in 2005, 2006, 2008, 2011, and 2012. The first CAFA experiment was organized between fall 2010 and spring 2012. The organizers provided 48,000 sequences to the community, with the task of predicting Gene Ontology annotations for each of these sequences. Of those 48,000 proteins, 866 were experimentally annotated during the target accumulation phase. The results showed that current function prediction algorithms perform significantly better than a simple domain assignment or a straightforward use of the BLAST package. However, they also revealed that accurate prediction of a protein's biological function is still an open and challenging problem.

References

[1] Radivojac, Predrag (2013). Nature Methods. 10: 221–227.
[2] Bernal, Axel (2001). Nucleic Acids Research. 29 (1): 126–127.

External links

Automated Function Prediction Special Interest Group - CAFA Challenge participation information
