Hierarchical Motif Vectors for Protein Alignment and Functional Classification (Motif Vectors)
Start date: 11 Nov 2009, End date: 10 Nov 2012 (PROJECT COMPLETED)

This proposal introduces hierarchical motif vectors for the numerical analysis of sequence motifs and develops a novel framework for the alignment and functional classification of proteins. Hierarchical motif vectors will be computed from multi-scale decompositions of property sequences, which are obtained by converting amino acid sequences into numeric sequences of various amino acid properties. These vectors will capture the variations of amino acid properties in the vicinity of each amino acid in the sequence of a given protein.

We will develop alignment algorithms that align amino acid sequences by matching their hierarchical motif vectors. We will also use unsupervised statistical learning algorithms to identify hierarchical motif vectors specific to functional protein groups, notably antigen-binding proteins, transcription factors, growth factors, and glycosylation proteins. We will then apply these methods to protein classification, using both the overlap scores from the hierarchical motif vector-based sequence alignment and the presence and extent of hierarchical motif vectors specific to the protein group under consideration. We will validate all methods developed in this project against existing sequence alignment, motif detection, and protein classification algorithms in the literature.

The project offers three main innovations. First, hierarchical motif vectors characterize local physico-chemical variations along an amino acid sequence, so that sequence motifs, once embedded in a vector space, can be analyzed with general machine learning methods. Second, sequence alignment can be tuned to different amino acid properties at various scales, which improves the potential of alignment-based protein similarity for functional classification. Third, group-specific hierarchical motif vectors will be identified as those that occur exclusively among the members of a protein group, which increases their likelihood of bearing functional specificity.
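To make the construction concrete, here is a minimal Python sketch of a per-residue multi-scale vector. The abstract does not specify the decomposition or the properties used, so everything here is an assumption for illustration: the Kyte-Doolittle hydropathy index stands in for "an amino acid property", centered window means at half-widths 1, 2, and 4 stand in for the multi-scale decomposition, and the function names (`property_sequence`, `window_mean`, `motif_vectors`) are hypothetical.

```python
# Kyte-Doolittle hydropathy index, used here as one example property scale.
# The project abstract does not say which properties it uses; this is an assumption.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def property_sequence(seq, scale=KYTE_DOOLITTLE):
    """Convert an amino acid string into a numeric property sequence."""
    return [scale[aa] for aa in seq.upper()]


def window_mean(values, center, half_width):
    """Mean over [center - half_width, center + half_width], clipped at the ends."""
    lo = max(0, center - half_width)
    hi = min(len(values), center + half_width + 1)
    return sum(values[lo:hi]) / (hi - lo)


def motif_vectors(seq, half_widths=(1, 2, 4)):
    """Return one vector per residue (hypothetical stand-in decomposition).

    Each vector holds the differences between successive smoothing levels
    (fine to coarse), followed by the coarsest window mean. The entries
    therefore sum back to the residue's own property value.
    """
    values = property_sequence(seq)
    vectors = []
    for i in range(len(values)):
        # Level 0 is the raw value; later levels use wider and wider windows.
        smooth = [values[i]] + [window_mean(values, i, h) for h in half_widths]
        details = [smooth[k] - smooth[k + 1] for k in range(len(half_widths))]
        vectors.append(details + [smooth[-1]])
    return vectors


if __name__ == "__main__":
    for residue, vec in zip("MKVLAAG", motif_vectors("MKVLAAG")):
        print(residue, [round(x, 2) for x in vec])
```

Under these assumptions, each residue is described by how its property value differs from its neighborhood at several scales. A distance between two such vectors (Euclidean, for instance) could then serve as a match score in an alignment, though the scoring scheme the project actually uses is not described here.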
