CLUL Talks with Roger Evans
Natural Language Technology Group
Fully lexicalised descriptions of language
Abstract
The history of language description since Chomsky has been a battle between order and chaos. Grammarians seek to impose the perfect order of grammatical description while language steadfastly refuses to cooperate. This is true even for well-behaved language, but the web now provides access to vast amounts of really 'natural' language, which demonstrates that well-behaved language is far from the norm. Issues of regularity versus exception and grammaticality versus ungrammaticality go to the core of language description and language processing. Over the last 25 years, formal grammarians have been playing with more complex notions of 'grammatical category' (in GPSG, HPSG, LFG, CCG, etc.) and wondering whether grammars really do need to be small (or even finite) to be interesting, while morphologists have started to invent formal languages (such as DATR, XFST, Network Morphology and Paradigm-Function Morphology) powerful enough to describe the general messiness of word forms. Putting these two notions together results in a space of language description systems in which lexical complexity can be traded against grammatical complexity, and the problematic nature of 'real' language has resulted in a trend towards more and more lexicalist grammar frameworks.
In this talk, I follow this trend to its logical conclusion, and explore the possibility of fully lexicalised descriptions of language. The idea of doing this has been around for a while, but achieving it requires a few hurdles to be overcome, not least a reconception of what we are trying to describe, a language powerful enough to describe it, and a different computational (or cognitive?) model of where and how descriptive 'work' gets done. I shall address these issues, with examples drawn from linguistics, computational linguistics and formal language theory, and introduce the 'Extended Lexicon Framework' (ELF), a tool we are currently developing to support this approach to linguistic description.
Short Biography
Roger Evans is a reader in Computer Science and leader of the Natural Language Technology Group (NLTG) at the University of Brighton. He completed his DPhil in Computational Linguistics at the University of Sussex in 1987 on Generalised Phrase Structure Grammar (GPSG, the forerunner of HPSG), supervised by Gerald Gazdar and Chris Mellish. In 1988 he was awarded a five-year SERC Advanced Fellowship to study the relationship between structure and processing in language. During this time he co-developed, with Gerald Gazdar, the lexical representation language DATR, which has been widely used in a range of linguistic and computational linguistic settings. In 1993 he moved to the University of Brighton, where he was deputy head of the Information Technology Research Institute (1994-2000) and in 2005 established the NLTG, which he still leads. His research interests include lexical representation, information extraction and text mining, natural language generation (he is currently chair of ACL-SIGGEN) and architectures for natural language processing systems. His current research focuses on the development of the Extended Lexicon Framework (ELF) system, probabilistic extensions to DATR, and the use of this technology in Cultural Informatics applications in the EC-funded Framework 7 Integrated Project, 3D-COFORM. He has a long-standing collaboration with the Surrey Morphology Group, currently working with Dunstan Brown on unsupervised learning of morphological classes, and is a senior visiting research fellow at the University of Sussex.
Website of Roger Evans
Website of the Natural Language Technology Group at the University of Brighton, UK
For more information, please contact one of the CLUL Talks organizers, the CLUL post-docs FCT Ciência 2007-2008: Michel Généreux, Tjerk Hagemeijer, Iris Hendrickx, Maria do Carmo Lourenço-Gomes, Javier Arias Navarro