Multilingual Lexicon Extraction from Comparable Corpora (MULTILEX)
Start date: 1 Sept. 2014
End date: 31 Aug. 2018
PROJECT COMPLETED
"Given large collections of parallel (i.e. translated) texts, it is well-known how to, by successively applying a sentence- and a word-alignment step, establish correspondences between words across languages. However, parallel texts are a scarce resource for most language pairs involving lesser-used languages. On the other hand, human second language acquisition seems not to require the reception of large amounts of translated texts, which indicates that there must be another way of crossing the language barrier. Apparently, the human capabilities are based on looking at comparable resources, i.e. texts or speech on related topics in different languages, which, however, are not translations of each other. Comparable (written or spoken) corpora are far more common than parallel corpora, thus offering the chance to overcome the data acquisition bottleneck. Despite its cognitive motivation, in the proposed project we will not attempt to simulate the complexities of human second language acquisition, but will show that it is possible by purely technical means to automatically extract information on word- and multiword-translations from comparable corpora. The aim is to push the boundaries of current approaches, which typically utilize correlations between co-occurrence patterns across languages, in several ways: 1) Eliminating the need for initial lexicons by using a bootstrapping approach which only requires a few seed translations. 2) Implementing a new methodology which first establishes alignments between comparable documents across languages, and then computes cross-lingual alignments between words and multiword-units. 3) Improving the quality of computed word translations by applying an interlingua approach, which, by relying on several pivot languages, allows a highly effective multi-dimensional cross-check. 4) We will show that, by looking at foreign citations, language translations can even be derived from a single monolingual text corpus."
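The abstract refers to approaches that correlate co-occurrence patterns across languages, starting from a few seed translations. The following is a minimal sketch of that general idea only, not the MULTILEX implementation. The toy English and German sentences, the four seed pairs, the window size, and the function names (`cooccurrence_vectors`, `cosine`, `translate`) are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def cooccurrence_vectors(sentences, window=1):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = {}
    for sentence in sentences:
        for i, word in enumerate(sentence):
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            context = sentence[lo:i] + sentence[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def translate(word, src_vectors, tgt_vectors, seeds):
    """Return the target word whose seed-context profile best matches `word`."""
    # Project the source context vector onto target-language seed words.
    projected = Counter()
    for ctx, count in src_vectors.get(word, {}).items():
        if ctx in seeds:
            projected[seeds[ctx]] += count
    known = set(seeds.values())
    scores = {}
    for candidate, vec in tgt_vectors.items():
        if candidate in known:
            continue  # seed words already have a translation
        restricted = Counter({k: c for k, c in vec.items() if k in known})
        scores[candidate] = cosine(projected, restricted)
    return max(scores, key=scores.get) if scores else None


# Hypothetical comparable (not parallel) toy corpora: same topics, different sentences.
en = [
    ["the", "dog", "barks", "loudly"],
    ["my", "dog", "bites", "strangers"],
    ["the", "cat", "purrs", "softly"],
    ["a", "cat", "drinks", "milk"],
]
de = [
    ["unser", "hund", "bellt", "nachts"],
    ["der", "hund", "bellt", "wieder"],
    ["ein", "hund", "beisst", "selten"],
    ["die", "katze", "schnurrt", "leise"],
    ["die", "katze", "trinkt", "wasser"],
    ["eine", "katze", "schnurrt", "gern"],
]
# Hypothetical seed lexicon: the only cross-lingual knowledge given to the sketch.
seeds = {"barks": "bellt", "bites": "beisst", "purrs": "schnurrt", "drinks": "trinkt"}

en_vectors = cooccurrence_vectors(en)
de_vectors = cooccurrence_vectors(de)
print(translate("dog", en_vectors, de_vectors, seeds))
print(translate("cat", en_vectors, de_vectors, seeds))
```

On this toy data the sketch pairs "dog" with "hund" and "cat" with "katze". A bootstrapping variant could add such newly found pairs to `seeds` and repeat the comparison, but that loop is left out here and real corpora would need far more data and filtering.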