The Automated Protein Function Prediction Special Interest Group
The accurate annotation of protein function is key to understanding life at the molecular level. However, because it is inherently difficult and expensive, experimental characterization of function cannot scale up to accommodate the vast amount of sequence data already available. The computational annotation of protein function has therefore emerged as a problem at the forefront of bioinformatics. Recently, the availability of genomic-level sequence information for thousands of species, coupled with massive high-throughput experimental data, has created new opportunities as well as challenges for function prediction. Research groups worldwide have developed many methodologies. Many are based on comparing uncharacterized sequences with databases of proteins whose functions are known. Others mine the scientific literature associated with some of these proteins, and still others combine sophisticated machine-learning algorithms with an understanding of biological processes to decipher what these proteins do. Indeed, we may have already identified a protein that is an ideal drug target for cancer, but it is lost in the myriad of data labeled as "function unknown".
The mission of the Automated Function Prediction Special Interest Group (AFP-SIG) is to bring together computational biologists, experimental biologists and biocurators who work on the important problem of predicting the function of genes and gene products, so that they can share ideas and form collaborations. The AFP-SIG holds annual meetings alongside the ISMB conference. We are also conducting the multi-year Critical Assessment of protein Function Annotation (CAFA) experiment.
The AFP meeting is noted for its strong community, engaged audience and premier speakers. Previous years' speakers include: Dame Janet Thornton (EBI), Peer Bork (EMBL), Barry Honig (Columbia University), Ewan Birney (EBI), Philip Bourne (UC San Diego), Russ Altman (Stanford University), Michael Sternberg (Imperial College), Steven Brenner (UC Berkeley), Amos Bairoch (Swiss Institute of Bioinformatics), Adam Godzik (UC San Diego), Simon Kasif (Boston University), David Jones (University College London), Jonathan Eisen (UC Davis), Olga Troyanskaya (Princeton University), Patricia Babbitt (UC San Francisco), Christine Orengo (University College London), Terry Gaasterland (UC San Diego), Kimmen Sjölander (UC Berkeley), Frederick Roth (University of Toronto), Shoshana Wodak (Sick Kids Toronto), Alex Bateman (EBI), Olivier Lichtarge (Baylor College of Medicine), Keith Dunker (Indiana University School of Medicine), Anna Tramontano (University of Rome, "La Sapienza"), Andrew Emili (University of Toronto), and Alfonso Valencia (CNIO, Spain).
About the CAFA challenge
The problem: There are many proteins in the databases for which the sequence is known, but the function is not. The gap between what we know and what we do not know is growing. A major challenge in the field of bioinformatics is to predict the function of a protein from its sequence or structure. At the same time, how can we judge how well these function prediction algorithms are performing?
The solution: The Critical Assessment of protein Function Annotation algorithms (CAFA) is an experiment designed to provide a large-scale assessment of computational methods dedicated to predicting protein function, using a time challenge. Briefly, CAFA organizers provide a large number of protein sequences (large blue oval below left). The predictors then predict the function of these proteins by associating them with Gene Ontology terms or (new in 2013) Human Phenotype Ontology terms. Following the prediction deadline, we wait for several months. During that time, some proteins whose functions had not been determined experimentally receive experimental verification (hatched red area, right oval below). Those proteins constitute the benchmark against which the methods are tested. You can read more about CAFA-1 here, and in the paper published in Nature Methods.
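The benchmark selection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CAFA organizers' actual pipeline: the function names, the protein identifiers and the dictionary layout are hypothetical, and the simple precision/recall scoring is an assumed stand-in for whatever evaluation the assessors apply.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: protein ID -> set of experimentally supported ontology terms.
Annotations = Dict[str, Set[str]]


def build_benchmark(targets: Set[str],
                    exp_at_deadline: Annotations,
                    exp_after_wait: Annotations) -> Annotations:
    """Keep targets with no experimental annotation at the prediction
    deadline that gained at least one during the waiting period."""
    benchmark: Annotations = {}
    for protein in targets:
        before = exp_at_deadline.get(protein, set())
        after = exp_after_wait.get(protein, set())
        if not before and after:
            benchmark[protein] = set(after)
    return benchmark


def precision_recall(predicted: Set[str], truth: Set[str]) -> Tuple[float, float]:
    """Assumed scoring for illustration: term-level precision and recall."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy data: P1, P2, P3 are made-up protein IDs.
targets = {"P1", "P2", "P3"}
exp_at_deadline = {"P3": {"GO:0003677"}}           # P3 was already characterized
exp_after_wait = {
    "P1": {"GO:0005524"},                          # newly characterized
    "P3": {"GO:0003677", "GO:0005524"},            # excluded: known at deadline
}

benchmark = build_benchmark(targets, exp_at_deadline, exp_after_wait)
prediction_for_p1 = {"GO:0005524", "GO:0003677"}   # one hypothetical method's output
precision, recall = precision_recall(prediction_for_p1, benchmark["P1"])
```

In this toy run only P1 enters the benchmark: P2 never receives experimental annotation, and P3 already had one at the deadline, so neither can be used to test predictions fairly.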
History of AFP meetings
The AFP meetings have been held almost annually since 2005. The first CAFA experiment was held in 2011-2012, and the second is being held from August 2013 to July 2014.