Lingüística computacional 2
Computational Linguistics 2 (31396)
Master's Program: Màster en Lingüística Teòrica i Aplicada
1. Course presentation
The course is a general introduction to the aspects of natural language processing related to the structure of words and sentences. It combines a theoretical presentation of the main topics in the area with a practical approach to the main strategies. It is comprehensive and covers both symbolic and statistical approaches to natural language processing. The course can be followed by students interested in acquiring a general overview of the field, and by those interested in a deeper understanding of the topics covered.
2. Objectives
The main objective of the course is to learn the techniques currently used in the computational treatment of morphology and syntax, that is, of words and of sentences as strings of words. On completing the course, the student will:
* know, write and use morphological processors
* know, write and use syntactic processors
* know, write and use morphosyntactic taggers
3. Syllabus
1. Regular expressions and Finite-State Automata
2. Computational morphology and Finite-State Transducers
   1. survey of basic aspects of morphology
   2. the lexicon and morphotactics
   3. orthographic rules
   4. morphological analysis with finite-state transducers
3. N-gram language models
   1. what we count in linguistic corpora, and how
   2. simple n-grams
   3. smoothing and other techniques for improving n-gram models
4. Part-of-Speech Tagging
   1. morphosyntactic tags
   2. morphosyntactic tagging
   3. general problems in morphosyntactic tagging
5. Formal Grammars
   1. survey of basic aspects of syntax
   2. Context Free Grammars
   3. Dependency Grammars
6. Syntactic Parsing
   1. parsing as search
   2. dynamic parsing methods
   3. partial parsing
7. Features and unification
   1. feature structures and unification
   2. feature structures in the grammar
   3. implementation of unification
   4. parsing with unification grammars
   5. types and inheritance
8. Statistical parsing
   1. probabilistic CFGs
   2. problems with probabilistic CFGs
   3. probabilistic lexicalised CFGs
4. Assessment
The assessment of the course will be based on:
5. Methods and activities
Each week, the course is organized as follows:
6. References
* Jurafsky, Daniel & Martin, James H. (2009), Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 2nd edition. Prentice Hall.
* Bird, Steven; Klein, Ewan & Loper, Edward (2009), Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit. O'Reilly Media.

Other recommended readings:
* Allen, James (1994), Natural Language Understanding. 2nd edition. Addison Wesley.
* Coleman, John (2005), Introducing Speech and Language Processing. Cambridge University Press.
* Manning, Christopher D. & Schütze, Hinrich (1999), Foundations of Statistical Natural Language Processing. The MIT Press.