Probabilistic Models in Pseudo-Euclidean Spaces (PROMOS)
Start date: 2 Jan 2014, End date: 2 Apr 2016. PROJECT COMPLETED

Biological sequence databases are a core source of information in the life sciences and have grown to many thousands of entries. Classically, querying such a database with a sequence requires comparing the query to each entry using an alignment algorithm such as FASTA, Smith-Waterman, or BLAST. Many real-time and high-throughput experiments rely on quick identification of the query to decide the next steps in the experimental pipeline, and they are currently slowed down by the cost of these classical retrieval systems.

The main objective of this proposal is to provide fast, large-scale identification algorithms in the non-metric spaces induced by the scoring functions of sequence alignments. To this end, the proposal aims at techniques that avoid the full calculation of the scores during training and retrieval by employing different mathematical and probabilistic approximation techniques.
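To make the idea of avoiding full score calculations at retrieval time concrete, the following is a minimal sketch of one generic approach, landmark-based filtering. It is not the project's own method. The scoring parameters, the function names (`smith_waterman`, `landmark_profile`, `build_index`, `retrieve`), and the choice of landmarks are all assumptions made for illustration. Each database entry is represented once by its alignment scores against a few landmark sequences. A query is then aligned only against the landmarks and against a short list of entries whose score profiles look similar.

```python
from typing import List, Sequence, Tuple


def smith_waterman(a: str, b: str, match: int = 2,
                   mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score with a linear gap penalty (toy scoring)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def landmark_profile(seq: str, landmarks: Sequence[str]) -> List[int]:
    """Scores of one sequence against every landmark (hypothetical embedding)."""
    return [smith_waterman(seq, lm) for lm in landmarks]


def build_index(database: Sequence[str],
                landmarks: Sequence[str]) -> List[List[int]]:
    """Precompute landmark profiles once, ahead of any query."""
    return [landmark_profile(seq, landmarks) for seq in database]


def retrieve(query: str, database: Sequence[str], index: List[List[int]],
             landmarks: Sequence[str],
             shortlist_size: int = 2) -> Tuple[int, int]:
    """Return (entry index, score) of the best hit within a shortlist.

    Only len(landmarks) + shortlist_size alignments are computed per query,
    instead of one alignment per database entry.
    """
    q = landmark_profile(query, landmarks)

    def profile_distance(i: int) -> int:
        # Squared Euclidean distance between score profiles (an assumption;
        # other profile comparisons are possible).
        return sum((x - y) ** 2 for x, y in zip(q, index[i]))

    shortlist = sorted(range(len(database)), key=profile_distance)[:shortlist_size]
    scored = [(smith_waterman(query, database[i]), i) for i in shortlist]
    best_score, best_index = max(scored, key=lambda pair: pair[0])
    return best_index, best_score


if __name__ == "__main__":
    database = ["ACGTACGT", "TTTTTTTT", "GGGGCCCC", "ACGTTTTT"]
    landmarks = ["ACGTACGT", "TTTTTTTT"]  # illustrative choice of landmarks
    index = build_index(database, landmarks)
    print(retrieve("ACGTACGT", database, index, landmarks))
```

The filter is a heuristic: an entry with a high alignment score but a dissimilar landmark profile can be missed, so the shortlist size trades speed against recall. The sketch compares profiles with an ordinary Euclidean distance for simplicity and does not model the indefinite (pseudo-Euclidean) geometry named in the project title.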