Tractament de la veu
(Hendrik Purwins)
General Information
Designed for linguistics students, this course provides a basic introduction to the signal processing needed for central concepts in speech processing, such as the Discrete Fourier Transform, Linear Prediction Analysis, and Hidden Markov Models.
Course Language: English. I apologize for not being able to speak Catalan. Please feel free to ask questions in Castilian; I will try to understand you, although my Castilian is also poor at this stage.
Time and Location:
Lecture: Friday 19.00-20.30 h (Ramblas 316)
Lab I: Wednesday 18.00-18.45 h (Ramblas 316)
Lab II: Thursday 19.40-20.25 h (Ramblas 102)
Assignments: Send assignments by email to . Two people can turn in one assignment they have worked out together. Every student has to be able to present the submitted solution in class.
Topic of Lab | Due
Octave processing | 10/13 10h; 1.7: 11/4 19h
Analog-Digital Conversion, Aliasing | -
Energy, Zero Crossing Rate, Correlation | 3.1-2: 11/11 19h
Autocorrelation, Frame-by-Frame Analysis | 4.1-2: 11/11 19h
Math: Trigonometry and Vectors | 1-10: 11/4 19h
Moving Average and Magnitude Spectrum | No. 1: 11/29 12h; No. 2: 12/2 19h
Automatic Speech Recognition and Hidden Markov Models | 12/6 12h
Grading: There will be an oral exam at the end of the trimester. I will examine you in groups of 3 people for half an hour on the homework and the lecture. The final grade is calculated from the exam (50 %) and the assignments (50 %).
Contact: Hendrik Purwins
Room 326, Ocata 1
Tel: 935 42 28 65
Office Hours: Tue 13-14.30 h, Wed 14.30-16 h
Course web site: http://www.cs.tu-berlin.de/~hendrik/parla
Preliminary Syllabus:
1 Speech Processing Tasks: Time and Frequency Domains
* Overview of Speech Processing
* Types of signal
* Sines and cosines
* Speech in the time domain
2 Basic Mathematics
* Complex Variables
* Convolution
* Vectors and Matrices
3 Fourier analysis
* Fourier analysis and the frequency domain
* Spectrum of a speech signal
4 Filters & Sampling
* Types of filter & spectral effect-frequency response
* Source filter model of speech production
* Sampling and the Sampling Theorem
* Aliasing, interpretation in time and frequency domains
* A/D and D/A conversion
5 Simple Processing of Waveforms
* Frame-by-frame analysis
* Signal energy, zero-crossing rate
* Autocorrelation & correlation with sine
6 Frequency Domain Analysis
* Spectrum via correlation with sine/cosine
* Discrete Fourier Transform (DFT)
* Windowing
* Application of DFT/FFT
7 Linear Prediction Analysis
* Linear prediction
* Applications: LP in pitch analysis and formant tracking
* Linear prediction synthesis
* Outline: LP in speech coding
8 Hidden Markov Models
* Probabilities
* Rule of Bayes
* Hidden Markov Model
9 HMM Applications
* Isolated Word Recognition with HMMs.
10 Outlook
Follow-up Course:
This course is meant as preparation for the speech processing course of the technology department in the 2nd trimester.
Handouts: The handout for each lecture is available from Tuesday noon of the week after the lecture, in the Oce store on your right-hand side as you enter the Ramblas building.
Books on speech processing and its technical background:
Comprehensive Book:
Daniel Jurafsky, James H. Martin
Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition, Prentice Hall, 2000.
ISBN: 0130950696
Technical Introduction for Non-engineers:
John Coleman
Introducing Speech and Language Processing
Cambridge University Press, 2005.
ISBN: 0-521-53069-5
Technical:
Ben Gold, Nelson Morgan
Speech and Audio Signal Processing, Wiley, 2000.
ISBN: 0-471-35154-7
Lawrence Rabiner, Biing-Hwang Juang
Fundamentals of Speech Recognition, Prentice Hall, 1993
ISBN: 0-13-015157-2
Digital Signal Processing:
Alan V. Oppenheim, Ronald W. Schafer, John R. Buck
Discrete-Time Signal Processing, Prentice Hall, 2nd Edition
ISBN: 0-13-754920-2
News:
10/29 - Office Hours Tue 13-14.30, Wed 14.30-16.00. I now have office hours on Tuesdays 13-14.30h and Wednesdays 14.30-16.00h in my office, Ocata 1, Room 326. If you have any problems with the lecture or the homework, I will be glad to help you.
10/29 - Correction in Math Homework No. 8
Instead of:
8. Calculate the angle between the vertical x-axis and r.
It should read:
8. Calculate the angle between the horizontal x-axis and r.
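For illustration only, the kind of calculation the corrected exercise asks for can be sketched in Python (the course itself uses Octave). The vector r = (3, 4) and the function name angle_to_x_axis are hypothetical and are not taken from the homework sheet.

```python
import math

def angle_to_x_axis(x, y):
    """Angle in degrees between the horizontal x-axis and the vector r = (x, y)."""
    # atan2 takes the vertical component first and handles all four quadrants.
    return math.degrees(math.atan2(y, x))

# Hypothetical example vector, not the one from the homework.
r = (3.0, 4.0)
print(round(angle_to_x_axis(*r), 2))
```

Using atan2 rather than atan(y/x) avoids a division by zero when r points straight up or down.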
10/29 - New Submission Dates for the Assignments
We have taken some time to catch up on the mathematical basics. Now we have to make an effort to catch up on the homework. The new submission deadlines for the homework are:
1.7: 11/4 19h
Math Homework 1-10: 11/4 19h
3.1-2: 11/11 19h
4.1-2: 11/11 19h
If you have a problem solving an exercise, please ask me in class, come to my office hours, or contact me to make an appointment, so I can help you with it.
11/17 - e in lab 3
In lab 3, in the function energy, replace the return variable e with another letter, e.g. y. Otherwise Octave confuses it with the built-in constant e = 2.71...
11/21 - Assignments
Some of you reported problems with the computers in Ramblas on the weekend. I will extend the deadline for assignment 2 (exercises 1 & 2) and assignment 3 (exercises 1 & 2) one last time, until tomorrow (Tuesday) 12 h. However, I do not think the state of the computers at Ramblas is a good excuse, since the assignments were due a long time ago.
We have spent a long time on very basic mathematical exercises. Please understand that towards the end of this class we have to speed up in order to cover the whole range of speech processing. We have to apply signal processing techniques in a sound and speech processing context.
There will be two more practica (longer than the previous ones). The next one is due November 28th, 12 am.
11/21 - Octave
- There are Octave implementations for Windows, for example GNU Octave and Octave under the Linux emulation called Cygwin (or Cygwin-X). They do not work fully: plotting, writing sounds, and cut-and-paste fail in some of them.
Thanks to Sara G., here is a description of how to install Octave under Cygwin. I have not checked it myself, but it seems to have helped some of you do the homework.
- You do not need to send me plots. Just send me the Octave code that generates them. However, if you want to save a plot, plot it and then try print("-deps2","plot.eps").
- If aplay does not work, use another program to listen to the sound, e.g. hxplay (Helix Player).
11/21 - Prospective exam date: Tuesday 13th December 17h-21h (date yet to be confirmed)
For evaluation we will have an oral group exam: you can form groups of 3 people, and for half an hour I will ask you about the homework and the lecture. Since some of you already have other exams on December 16th, a discussion with some of you showed that Tuesday 13th December 17h-21h would be a better time. IMPORTANT: If you are not available on that date, please contact me immediately!
11/24 - Typo in lab 5
It should read:
y=filter([b0 b1 .... bp],[a0 a1 ... ak],x)
Set a0=1.
Instead of
Set b0=1.
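As a rough illustration of why a0 is the coefficient to fix at 1, here is a minimal Python sketch of the difference equation that Octave's filter(b, a, x) is understood to compute. It assumes the usual convention a0*y[n] = b0*x[n] + ... + bp*x[n-p] - a1*y[n-1] - ... - ak*y[n-k]. The function name iir_filter and the example signals are hypothetical and are not part of lab 5.

```python
def iir_filter(b, a, x):
    """Apply the difference equation
    a[0]*y[n] = sum_k b[k]*x[n-k] - sum_{k>=1} a[k]*y[n-k]
    to the input sequence x, assuming zero initial conditions."""
    if a[0] == 0:
        raise ValueError("a[0] must be non-zero")
    y = []
    for n in range(len(x)):
        acc = 0.0
        # Feed-forward part: weighted current and past inputs.
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        # Feedback part: weighted past outputs (a[0] is excluded here).
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        # With a[0] = 1 this division changes nothing.
        y.append(acc / a[0])
    return y

# Hypothetical inputs: a 3-point moving average (no feedback, a = [1]).
smoothed = iir_filter([1 / 3, 1 / 3, 1 / 3], [1.0], [3.0, 6.0, 9.0, 12.0])
# A one-pole feedback filter applied to a unit impulse.
decay = iir_filter([1.0], [1.0, -0.5], [1.0, 0.0, 0.0, 0.0])
print(smoothed)
print(decay)
```

In this form a0 only rescales the output, so setting a0=1 leaves the b coefficients free to define the moving average directly.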
11/27 - Changed Deadlines for Lab 5 and 6
You will have a little more time for Lab 5. The first task of Lab 5 is due Tuesday 11/29 12 h. For No. 1 ask Barbara. She knows how to do the exercise.
No. 2 of Lab 5 is due Friday 12/2 19 h. If you have problems with the lab please come to my class on Thursday 19.40 h in room 102. Lab 6 is due Tuesday 12/6 12 h. But try to do this one on your own so we can focus on octave and Lab 5 No. 2 on Thursday.
11/27 No class on Wednesday 11/30 18h.
No office hour on Wednesday 14.30-16.00h.
As the organizer, I have to attend a workshop at that time. You are welcome to join. Please come to the class on Thursday 19.40 h in Ramblas room 102. I will be available on Thursday until 20.40 h or during my office hours on Tuesday 11/29 13.00-14.30 h in the Ocata building. If you cannot come on either of these dates, contact me so we can make an extra appointment.
11/27 Typo in Lecture Note No. 7
On page 35, last four lines, it should read
P(X(i+1)=y| X(i)=y)
instead of
P(X(i+1)=y| P(X(i)=y)
....
P(X(i+1)=eh| X(i)=y)
instead of
P(X(i+1)=eh| P(X(i)=y)
...
P(X(i+1)=stop| X(i)=eh)=0.1
instead of
P(X(i+1)=stop| P(X(i)=eh)=0.1
I will include these pages again in the lecture notes no. 8.
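To make the corrected notation concrete, here is a small Python sketch that stores transition probabilities P(X(i+1)=b | X(i)=a) in a table and multiplies them along a state sequence. Only the value 0.1 for the transition from eh to stop comes from the correction above. All other numbers, and the function name sequence_probability, are assumptions chosen so that each row sums to 1.

```python
# transitions[a][b] stands for P(X(i+1)=b | X(i)=a).
transitions = {
    "y": {"y": 0.6, "eh": 0.4},      # hypothetical values
    "eh": {"eh": 0.9, "stop": 0.1},  # 0.1 is from the lecture note; 0.9 is assumed
}

def sequence_probability(states, transitions):
    """Probability of a state sequence given its first state,
    as the product of the one-step transition probabilities."""
    p = 1.0
    for current, following in zip(states, states[1:]):
        # A transition missing from the table has probability 0.
        p *= transitions.get(current, {}).get(following, 0.0)
    return p

print(sequence_probability(["y", "y", "eh", "eh", "stop"], transitions))
```

Each factor conditions on a state, never on a probability, which is the point of the correction.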
11/27 Groups for Oral Exam
Please email me the names of the people (maximum 3 per group) with whom you want to take the oral group exam on Tuesday 13th December 17h-21h. Please let me know if you are only available for part of that time.
1/7/06 Final Grades
In the list you will find the grade for each homework and the final grade.
1/7/06 Follow-Up Course
If you like the taste of programming speech processing applications, you should join my class Tractament Digital de la Parla this trimester (lecture: Friday 15-17h).
It will cover the same topics as Tractament de la Parla, but in much more depth with respect to programming and, to a lesser extent, math. For an overview of last year's exercises, please have a look at
http://www.iua.upf.es/~sstreich/tdp/
The most challenging programming exercises (in C) will be the implementation of the Levinson-Durbin algorithm for calculating the LPC coefficients and of the Viterbi algorithm for finding the most likely state sequence in an HMM.
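To give a flavour of the second exercise, here is a minimal Python sketch of the Viterbi algorithm on a toy HMM. The course exercise is in C, so this is only an illustration and not a reference solution. The states, observation symbols, and all probabilities are hypothetical.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence for the observations and its probability."""
    if not observations:
        raise ValueError("observations must not be empty")
    # best[t][s]: probability of the most likely path that ends in state s at time t.
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that path.
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, p = max(
                ((r, best[t - 1][r] * trans_p[r][s]) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = p * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace the best path backwards from the most likely final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]

# Toy model with hypothetical numbers: two states and two observation symbols.
states = ["y", "eh"]
start_p = {"y": 1.0, "eh": 0.0}
trans_p = {"y": {"y": 0.6, "eh": 0.4}, "eh": {"y": 0.0, "eh": 1.0}}
emit_p = {"y": {"a": 0.8, "b": 0.2}, "eh": {"a": 0.1, "b": 0.9}}

path, prob = viterbi(["a", "a", "b", "b"], states, start_p, trans_p, emit_p)
print(path, prob)
```

For long observation sequences the products underflow, so a real implementation would typically work with log probabilities and sums instead.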
1/24/06 September Exams
For the September exams you have to do all the homework from last trimester's class and bring all of it in printed form. There will be an oral exam of 30-40 min. duration. You should be able to explain all the homework. I will also ask you questions about the lecture, especially about the material that I summarized in my last lecture.