LE-PAROLE


(1996-1998 - European Commission Programme - DGXIII, Telematics Application of Common Interest - Contract LE2-4017)

Partnership:
Consorzio Pisa Ricerche (coordinator) - Italy
Centro de Linguística da Universidade de Lisboa - Portugal
Det Danske Sprog- og Litteraturselskab - Denmark
Fundación Bosch Gimpera, Universitat de Barcelona - Spain
Göteborgs Universitet, Dept. of Swedish, Språkdata - Sweden
GSI-ERLI - France
Institiúid Teangeolaíochta Éireann - Ireland
Institut d'Estudis Catalans - Spain
Institut für Deutsche Sprache - Germany
Institute for Language and Speech Processing - Greece
Instituut voor Nederlandse Lexicologie - Netherlands
University of Birmingham - England
University of Helsinki - Finland
University of Liège - Belgium
Associated Portuguese Partner:
Instituto de Engenharia de Sistemas e Computadores (INESC)
CLUL's Research Team:
João Malaca Casteleiro (principal researcher)
Maria Fernanda Bacelar do Nascimento (coordinator)
Corpus:
Maria Lúcia Garcia Marques
Luísa Alice Santos Pereira
José Bettencourt Gonçalves
José Manuel Feio
Lexicon:
Palmira Marrafa
Amália Mendes
José Bettencourt Gonçalves
Florbela Barreto
Rita Veloso
Maria João Ferro
Clara Rowland
José Manuel Feio

Project Status:
concluded

Description

LE-PAROLE is a project that uses linguistic and computational resources already available in European countries to build corpora and lexicons according to integrated models for composing and describing the materials. The use of common tools makes multilingual connections possible and serves a wide range of applications. For each language, a 20-million-word corpus was built with harmonized design, composition and encoding, including a 250,000-word tagged subcorpus. Each language's lexicon consists of 20,000 entries with syntactic and morphosyntactic information.

The following materials are available for purchase in ELDA's catalogue:

  • a 3-million-word corpus composed of newspapers (65%), books (20%), magazines (5%) and miscellaneous texts (10%); it includes a 250,000-word subcorpus (with approximately the same distribution as the main corpus) that is morphosyntactically annotated following the standard criteria of the PAROLE project
    http://www.elda.fr/cata/text/W0024.html
  • a lexicon of 20,000 lemmas with morphosyntactic and syntactic information
    http://www.elda.fr/cata/text/L0035.html
Publications:

Bacelar do Nascimento, F., L. A. Pereira and J. Saramago (2000), "Portuguese Corpora at CLUL", in Second International Conference on Language Resources and Evaluation - Proceedings, Volume III, Athens, pp. 1603-1607.

Bacelar do Nascimento, M. F. (coord.) (1999), Portuguese lexicon of multilingual LE PAROLE Lexicon, http://www.elda.fr/catalogue/text/L0035.html.

Bacelar do Nascimento, M. F. (coord.) (1999), Portuguese sub-corpus of multilingual LE PAROLE corpus, http://www.elda.fr/catalogue/text/W0024.html.

Marrafa, P., J. Gonçalves, A. Mendes and R. Veloso (1999), "A Sintaxe do LE-PAROLE", in Marrafa, P. and Mota, M.ª (eds.), Linguística Computacional. Investigação Fundamental e Aplicações, Lisbon, Associação Portuguesa de Linguística / Edições Colibri, pp. 191-205.

Bacelar do Nascimento, M. F., P. Marrafa, L. A. S. Pereira, R. Ribeiro, R. Veloso and L. Wittmann (1998), "LE-PAROLE - Do corpus à modelização da informação lexical num sistema-multifunção", Actas do XIII Encontro da Associação Portuguesa de Linguística, APL, Lisbon, September 1998, pp. 115-134.

Last Updated on Tuesday, 09 November 2010 15:40  

