Lingüística computacional 1
Computational Linguistics I (31395)
Juan María Garrido Almiñana
Goals
The goal of the course is twofold: to give students a general overview of the basic theoretical concepts related to the computational analysis and processing of speech, and to provide them with basic skills in the use of tools and resources for speech analysis and speech technology.
The course is aimed mainly at students with some background in Linguistics who are interested in using computer techniques both for basic research in Phonetics and Phonology and for professional work as computational linguists in the Speech Technology field. For this reason, the course will focus on the concepts needed to use and develop computer speech processing tools as a computational linguist, rather than on the engineering, mathematical or programming knowledge behind these tools.
Competences
1. Basic knowledge of the theoretical concepts of Acoustic Phonetics and computational speech processing.
2. Basic skills in the use of Praat and other speech processing tools.
3. Basic knowledge of the research procedures and methodologies used in the area of computational speech analysis.
4. Basic knowledge of the procedures and methodologies used in the development of commercial speech technologies.
5. Basic ability to carry out small projects related to computational speech processing.
Contents
1. Speech signals
Speech waves. Basic parameters: time, amplitude and frequency. Periodic and aperiodic signals. Simple and complex signals. Spectral analysis: the Fourier transform. Acoustic model of speech production: sources and filters. Types of sources in speech production. Filters and resonators. Acoustic features of speech signals: spectral envelope, formants, F0.
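As an illustration of the spectral analysis topic in this unit, the following minimal sketch builds a complex periodic signal from two simple sine waves and recovers their frequencies with a discrete Fourier transform. The sampling rate, the component frequencies and the function name `dft_magnitudes` are illustrative assumptions, not course material; in the course itself this kind of analysis is done with Praat.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return the magnitude of each DFT bin from 0 Hz up to half the sampling rate."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        magnitudes.append(abs(coeff) / n)
    return magnitudes

# Assumed values for the illustration: 0.1 s of signal sampled at 800 Hz,
# so consecutive DFT bins are 10 Hz apart.
SAMPLE_RATE = 800
N_SAMPLES = 80

# Complex signal: a 100 Hz component plus a weaker 250 Hz component.
signal = [
    1.0 * math.sin(2 * math.pi * 100 * i / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 250 * i / SAMPLE_RATE)
    for i in range(N_SAMPLES)
]

magnitudes = dft_magnitudes(signal)

# Keep the bins whose magnitude clearly stands out and convert them to Hz.
peaks = [k * SAMPLE_RATE / N_SAMPLES for k, m in enumerate(magnitudes) if m > 0.1]
print(peaks)
```

The two frequencies that stand out in the spectrum are the two simple components that were added together, which is the basic idea behind reading a spectrum.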
2. Speech digitisation, coding and storage
Analogue-to-digital (A/D) conversion. Analogue and digital signals. Sampling. Sampling frequency. A/D converter resolution. Aliasing. Clipping. Speech coding: needs and applications. Speech coding methods. Storing speech: files and formats.
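The relation between sampling frequency and aliasing can be illustrated with a short sketch. The function names and the example frequencies below are assumptions chosen for the illustration. The sketch shows that a tone above half the sampling frequency produces exactly the same samples, apart from sign, as a lower-frequency tone.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, n_samples):
    """Sample a unit-amplitude sine wave of freq_hz at sample_rate_hz."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency at which a pure tone appears once it has been sampled."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

# Assumed example: a 7000 Hz tone sampled at 8000 Hz, which is too low
# because 7000 Hz lies above half the sampling frequency (4000 Hz).
SAMPLE_RATE = 8000
high_tone = sample_sine(7000, SAMPLE_RATE, 16)
low_tone = sample_sine(1000, SAMPLE_RATE, 16)

print(alias_frequency(7000, SAMPLE_RATE))  # 1000

# The sampled 7000 Hz tone is a sign-inverted copy of the sampled 1000 Hz tone.
indistinguishable = all(abs(h + l) < 1e-9 for h, l in zip(high_tone, low_tone))
print(indistinguishable)
```

Once the signal has been sampled, nothing in the samples tells the two tones apart, which is why frequencies above half the sampling frequency are normally filtered out before A/D conversion.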
3. Speech analysis and modelling
Computational methods for the experimental analysis of speech. Speech analysis tools: Praat. Basic representation methods: spectra, spectrograms, F0 contours. Identifying acoustic features in speech. Making procedures automatic: scripts. Using large corpora in speech research. Modelling in speech research. Automatic modelling: MoMel-IntSint, MelAn.
4. Speech synthesis
4.1. Synthesis methods and techniques
Concept of speech synthesis. Types: analysis-by-synthesis, natural speech modification, speech generation. Analysis-by-synthesis techniques: LPC, sinusoidal. Speech modification: Overlap-Add techniques. Speech generation: formant synthesis, articulatory synthesis.
4.2. Text-to-speech systems
Definition. Typical structure of a text-to-speech (TTS) system. Linguistic processing in TTS: pre-processing, letter-to-sound, morpho-syntactic analysis, prosodic analysis. Speech wave generation: unit selection, F0 and duration prediction, speech signal modification. Developing TTS systems.
5. Speech recognition
5.1. Recognition methods and techniques
Concept of speech recognition. Steps in the recognition process: parametrisation, acoustic recognition. Parameters used in speech recognition. Recognition techniques: HMM, neural nets, linguistic rules.
5.2. Speech recognition systems
Definition and typical steps: parametrisation, acoustic recognition, linguistic post-processing. Developing speech recognition systems. Other types of recognition systems: speaker recognition, language recognition, emotion recognition.
6. Speech corpora
6.1. Definition and features
Types and applications. Transcription and annotation levels: words, phones, syllables, intonation groups. Prosodic and linguistic annotation. Some examples: ALBAYZIN, C-ORAL-ROM, Glissando.
6.2. Developing speech corpora
Design and collection. Recording. Orthographic and phonetic transcription. Annotation and time-alignment. Tools for speech transcription. Segmentation and annotation tools.
Course organisation
Each session will be organised in two parts: the first devoted to theoretical issues, and the second to practical activities related to the theoretical contents of the first part. Students will carry out these activities with the support of the teacher, as part of the learning process.
Evaluation
The grade for the course will be calculated considering the results of:
· two practical exercises done during the course (40% of the final grade)
· one practical project, proposed by the student according to his/her interests, or chosen from a list provided by the teacher (60% of the final grade).
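The weighting above can be written as a short calculation. This is only a sketch: the 0-10 marking scale, the equal weight of the two exercises within their 40% share, and the function name `final_grade` are assumptions that the syllabus does not state.

```python
def final_grade(exercise_marks, project_mark):
    """Combine the marks using the 40% / 60% weights given in the syllabus.

    Assumption: the exercises count equally, so their mean takes the 40% share.
    """
    exercises_mean = sum(exercise_marks) / len(exercise_marks)
    return 0.4 * exercises_mean + 0.6 * project_mark

# Hypothetical marks on an assumed 0-10 scale.
grade = final_grade([6.0, 8.0], 9.0)
print(round(grade, 2))
```

With these hypothetical marks, the exercises contribute 0.4 × 7 = 2.8 points and the project 0.6 × 9 = 5.4 points, for a final grade of 8.2.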
General readings
GOLD, B. - MORGAN, N. (2000).- Speech and Audio Signal Processing: Processing and Perception of Speech and Music, Wiley. UPF
HARRINGTON, J. - CASSIDY, S. (1999).- Techniques in Speech Acoustics, Dordrecht, Kluwer Academic Publishers. UPF
O'SHAUGHNESSY, D. (1987).- Speech Communication. Human and Machine. Addison Wesley Series in Electrical Engineering, 2nd edition, 2000.
SCHROEDER, M. R. (1999).- Computer Speech. Recognition, Compression, Synthesis, Springer-Verlag. UPF