Centro de Linguística da Universidade de Lisboa (coordinating institution)
University of Toulouse-le-Mirail (Responsible: Paul Rivenc)
University of Provence-Aix-Marseille (Responsible: Cl. Blanche-Benveniste)
CLUL's Research Team:
João Malaca Casteleiro (principal researcher)
Maria Fernanda Bacelar do Nascimento (coordinator)
Maria Lúcia Garcia Marques
José Bettencourt Gonçalves
João Miguel Casteleiro (computer consultant)
Isabel Sampaio (computer technician)
Corpus download:
S. Tome e Principe.zip
Instrucoes de utilizacao.txt
Programa Lingua (executavel).zip
The project is concluded, and the materials are published on CD-ROM, with the exclusive publishing support of Instituto Camões, under the title Português Falado - Documentos Autênticos: Gravações áudio com transcrição alinhada. Distribution outside Portugal is handled by Instituto Camões and, within Portugal, by CLUL. From the original project, a corpus of samples of the Portuguese varieties spoken in Portugal, Brazil, the African countries with Portuguese as their official language, and Macao was derived. The published materials also include samples of the Portuguese spoken in Goa and in East-Timor, collected later. These speech samples, recorded in various places, situations and periods, are accompanied by the corresponding aligned orthographic transcriptions.
The four published CD-ROMs include a spoken Portuguese corpus, with aligned sound and orthographic transcription, collected from sociolinguistically diverse speakers who have Portuguese as their mother tongue or as a second language. The corpus consists of informal conversations between acquaintances, friends or relatives, as well as formal events such as radio programs or conferences. The 86 recordings exemplify the Portuguese spoken in Portugal (30), in Brazil (20), in the African countries with Portuguese as their official language: Angola, Cape Verde, Guinea-Bissau, Mozambique and Sao Tome and Principe (5 each), in Macao (5), in Goa (3) and in East-Timor (3), corresponding to 8h44m of recording and to 91,966 tokens. The recordings cover the period from 1970 to 2001, and approximately 70% of them date from the last decade of that period.
These samples of Portuguese varieties are distributed in the four CD-ROMs in the following way:
Finally, 94 speakers appear in the recordings. Their characteristics (origin, sex, age, professional status, level of education) are given in the header of each transcription, which also records the place, date and situation in which the recording was made, along with other relevant information.
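The figures quoted above can be cross-checked with a short tally. The following is only a minimal sketch: the per-variety counts, the duration and the token count are taken from the description, while the variable names and the derived averages (tokens per minute, tokens per recording) are illustrative and do not come from the published materials.

```python
# Recording counts per variety, as stated in the corpus description.
recordings = {
    "Portugal": 30,
    "Brazil": 20,
    "Angola": 5,
    "Cape Verde": 5,
    "Guinea-Bissau": 5,
    "Mozambique": 5,
    "Sao Tome and Principe": 5,
    "Macao": 5,
    "Goa": 3,
    "East-Timor": 3,
}

# Totals stated in the description.
stated_total_recordings = 86
total_tokens = 91_966
duration_minutes = 8 * 60 + 44  # 8h44m expressed in minutes

# The per-variety counts should add up to the stated total.
total_recordings = sum(recordings.values())

# Derived averages (illustrative only, not quoted in the source).
tokens_per_minute = total_tokens / duration_minutes
tokens_per_recording = total_tokens / total_recordings

print(f"Recordings: {total_recordings} (stated: {stated_total_recordings})")
print(f"Duration: {duration_minutes} minutes")
print(f"Average tokens per minute: {tokens_per_minute:.1f}")
print(f"Average tokens per recording: {tokens_per_recording:.1f}")
```

The per-variety counts sum to the stated 86 recordings, so the breakdown is internally consistent.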
Bacelar do Nascimento, M. F. (coord.) (2001), Português Falado, Documentos Autênticos, Gravações áudio com transcrições alinhadas, em CD-ROM, Lisboa, Centro de Linguística da Universidade de Lisboa e Instituto Camões.
Bacelar do Nascimento, M. F., L. A. S. Pereira and J. Saramago (2000), "Portuguese Corpora at CLUL", in Second International Conference on Language Resources and Evaluation – Proceedings, Volume II, Athens, pp. 1603-1607.
Bacelar do Nascimento, M. F. (2001), "Les études portugaises sur la langue parlée" in CARREIRA, M. H. A. (org.) Travaux et Documents, Les langues romanes en dialogue(s), 11-2001, Université Paris 8, Vincennes Saint-Denis, pp. 209-221.
Bacelar do Nascimento, M. F. et alii (2001), Poster "Português Falado" in Feira de Projectos, organized by the Comissão Nacional do Ano Europeu das Línguas, Lisboa, Centro Cultural Casapiano, 27-30 September.
Bettencourt Gonçalves, J. (2000), "Português Falado: variedades geográficas e sociais", in Estudos de gramática portuguesa (1), Eberhard Gärtner, Christine Hundt, Axel Schönberger (eds.), Frankfurt am Main.
Bettencourt Gonçalves, J. and R. Veloso (2000), "Spoken Portuguese: Geographic and Social Varieties", in Proceedings of the Second International Conference on Language Resources and Evaluation, Volume II, National Technical University of Athens Press, Athens, Greece, pp. 905-908.
Pereira, L. A. S. (2004), "The use of concordancing in Portuguese teaching", in Sinclair, J. M. (ed.) How to Use Corpora in Language Teaching, Amsterdam, John Benjamins P. C., pp. 109-122.