Compositional Operations in Semantic Space (COMPOSES)
Start date: 1 Nov 2011, End date: 31 Oct 2016. PROJECT COMPLETED

"The ability to construct new meanings by combining words into larger constituents is one of the fundamental and peculiarly human characteristics of language. Systems that induce the meaning and combinatorial properties of linguistic symbols from data are highly desirable both from a theoretical perspective (modeling a core aspect of cognition) and for practical purposes (supporting human-computer interaction). COMPOSES tackles the meaning induction and composition problem from a new perspective that brings together corpus-based distributional semantics (that is very successful at inducing the meaning of single content words, but ignores functional elements and compositionality) and formal semantics (that focuses on functional elements and composition, but largely ignores lexical aspects of meaning and lacks methods to learn the proposed structures from data). As in distributional semantics, we represent some content words (such as nouns) by vectors recording their corpus contexts. Implementing instead ideas from formal semantics, functional elements (such as determiners) are represented by functions mapping from expressions of one type onto composite expressions of the same or other types. These composition functions are induced from corpus data by statistical learning of mappings from observed context vectors of input arguments to observed context vectors of composite structures. We model a number of compositional processes in this way, developing a coherent fragment of the semantics of English in a data-driven, large-scale fashion. Given the novelty of the approach, we also propose new evaluation frameworks: On the one hand, we take inspiration from cognitive science and experimental linguistics to design elicitation methods measuring the perceived similarity and plausibility of sentences. On the other, specialized entailment tests will assess the semantic inference properties of our corpus-induced system."
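The abstract describes composition functions that are learned as mappings from the context vectors of input arguments to the context vectors of the resulting composite expressions. The sketch below illustrates that idea under strong assumptions that are not stated in the abstract: the mapping is taken to be linear, it is fitted by ridge-regularised least squares, and the "corpus" vectors are synthetic random data. The names `learn_composition_matrix`, `compose`, and `cosine` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np


def learn_composition_matrix(arg_vectors, phrase_vectors, ridge=1e-3):
    """Fit a matrix W such that phrase ≈ arg @ W.

    Assumption: the composition function is linear and is estimated by
    ridge-regularised least squares. The abstract only says "statistical
    learning of mappings"; the specific estimator is chosen here for illustration.
    """
    X = np.asarray(arg_vectors, dtype=float)      # (n_examples, dim) argument vectors
    Y = np.asarray(phrase_vectors, dtype=float)   # (n_examples, dim) composite vectors
    dim = X.shape[1]
    # Closed-form ridge solution: W = (X^T X + lambda * I)^{-1} X^T Y
    return np.linalg.solve(X.T @ X + ridge * np.eye(dim), X.T @ Y)


def compose(W, arg_vector):
    """Apply the learned function to a new argument vector."""
    return np.asarray(arg_vector, dtype=float) @ W


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Synthetic stand-in for corpus data (hypothetical, not from the project):
# "noun" vectors and the vectors of the phrases they form with one function word.
rng = np.random.default_rng(0)
dim, n_train = 5, 200
true_map = rng.normal(size=(dim, dim))
nouns = rng.normal(size=(n_train, dim))
phrases = nouns @ true_map + 0.01 * rng.normal(size=(n_train, dim))

W = learn_composition_matrix(nouns, phrases)

# Predict the composite vector for a noun that was not seen in training.
unseen_noun = rng.normal(size=dim)
predicted = compose(W, unseen_noun)
expected = unseen_noun @ true_map
similarity = cosine(predicted, expected)
```

In this toy setting the function word is represented by the matrix `W` rather than by a vector of its own, which mirrors the abstract's split between content words as vectors and functional elements as functions. With real data, the training pairs would be context vectors observed in a corpus, and the similarity between predicted and observed phrase vectors would be one possible sanity check.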

Details