Subject description - AE2M31ZRE

AE2M31ZRE Speech Processing
Extent of teaching: 2+2c
Guarantors:
Roles: PO, V
Language of
Teachers:
Completion: Z, ZK
Responsible Department: 13131
Credits: 6
Semester: L


The subject covers the basics of speech processing and is intended for master's program students, with a special focus on multimedia applications. The speech technology discussed is currently applied in many systems in different fields (e.g. information dialogue systems, voice-controlled devices, dictation systems, transcription of audio-video recordings, support for language teaching, etc.). Further information can be found at and at

Study targets:

The goal of the subject is to introduce the speech technology used in the most important multimedia applications. Students should master topics such as the basic characteristics of the speech signal, speech enhancement, speech recognition, speech synthesis, and audio-visual speech processing. Students will practice basic speech processing tasks in the MATLAB environment, and other publicly available tools for speech analysis will also be used. As homework, students will prepare a semester project, which will be presented during the exercises according to the planned schedule.

Course outlines:

1. Introduction - speech signal (digital form), speech production model
2. Basic characteristics of speech signal, phonetic and articulatory aspects
3. Spectral characteristics of speech signal (DFT and LPC spectrum)
4. Noise suppression in speech signal (additive and convolutional noise, single-channel and multi-channel methods)
5. Hearing aids and cochlear implants (anatomy and hearing model, speech processing)
6. Principles of speech recognition, basic tasks and applications
7. Feature extraction for speech recognition
8. Small vocabulary speech recognition based on DTW and HMM (HTK)
9. Dictation and transcription systems (large vocabulary speech recognition)
10. Speaker verification and identification
11. Speech synthesis - basic principles (concatenative and formant synthesis, PSOLA)
12. Audio-visual speech recognition
13. Multimedia systems with voice input (dialogue systems, speech therapy, language teaching)
14. Language recognition. Reserve.
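
As an illustration of the DTW-based small-vocabulary recognition listed in item 8, the following is a minimal sketch in Python (not the MATLAB or HTK material used in the course). It assumes each utterance is already available as a sequence of feature vectors; the function names `local_distance`, `dtw_distance` and `recognize`, the Euclidean local distance, and the symmetric step pattern are illustrative assumptions, not taken from the course.

```python
import math


def local_distance(x, y):
    """Euclidean distance between two feature vectors (assumed local cost)."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def dtw_distance(a, b):
    """Accumulated cost of the best DTW alignment of sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = local_distance(a[i - 1], b[j - 1])
            # Allowed steps: vertical, horizontal, diagonal
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def recognize(utterance, templates):
    """Return the label of the template closest to the utterance under DTW."""
    return min(templates, key=lambda word: dtw_distance(utterance, templates[word]))
```

With one stored template per word, `recognize(features, {"yes": yes_template, "no": no_template})` would return the label whose template has the lowest alignment cost. Practical recognizers typically add path constraints and length normalization, which this sketch omits.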

Exercises outline:

1. Introduction: speech signal, tools for analysis, sources of speech signals
2. Basic time-domain characteristics: energy, intensity, zero-crossing, fundamental frequency
3. Spectral characteristics: short-time DFT and LPC spectrum, spectrogram
4. Suppression of additive noise in speech signal
5. Suppression of convolutional noise in speech signal
6. Speech processing for hearing aids and cochlear implants
7. Cepstrum and cepstral distance: voice activity detection, features for recognition
8. DTW-based recognition: simple recognizer of particular words
9. HMM-based recognition: basic tasks and demonstration of HMM modelling
10. Speaker verification based on GMM
11. Speech synthesis: implementation of formant synthesis, demonstration of available tools
12. Semester work presentations
13. Semester work presentations
14. Reserve. Credits
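
The time-domain characteristics named in exercise 2 can be sketched as follows. This is a hedged illustration in plain Python rather than the MATLAB environment used in the exercises; the helper names `split_frames`, `short_time_energy` and `zero_crossing_rate`, and the choice of non-windowed rectangular frames, are assumptions made for this example only.

```python
def split_frames(signal, frame_len, hop):
    """Cut a sampled signal into overlapping frames of frame_len samples."""
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)


def analyze(signal, frame_len, hop):
    """Return a list of (energy, zero-crossing rate) pairs, one per frame."""
    return [
        (short_time_energy(f), zero_crossing_rate(f))
        for f in split_frames(signal, frame_len, hop)
    ]
```

For speech sampled at 16 kHz, a call such as `analyze(samples, 400, 160)` would correspond to 25 ms frames with a 10 ms hop; these values are common choices, not ones prescribed by the course. Voiced segments tend to show high energy and a low zero-crossing rate, while unvoiced fricatives tend to show the opposite.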


Literature:

[1] Rabiner, L., Schafer, R. W.: Introduction to Digital Speech Processing (Foundations and Trends in Signal Processing). Now Publishers Inc, 2007.
[2] Huang, X., Acero, A., Hon, H.-W.: Spoken Language Processing. Prentice Hall, 2001.
[3] Deller Jr., J. R., Hansen, J. H. L., Proakis, J. G.: Discrete-time Processing of Speech Signals. Wiley, 2000.
[4] McLoughlin, I.: Applied Speech and Audio Processing: With MATLAB Examples. Cambridge University Press, 2009.
[5] Jelinek, F.: Statistical Methods for Speech Recognition (Language, Speech, and Communication). The MIT Press, 1998.
[6] ITU-T Recommendations -


Requirements:

Basic knowledge of digital signal processing is assumed.



Keywords:

speech processing, speech recognition, speech synthesis, speech enhancement, speech technology applications, audio-visual speech processing

The subject is included in these academic programs:

Program   Branch                                            Role  Recommended semester
MEKME2    Multimedia Technology                             PO    2
MEKME5    Systems of Communication                          PO    2
MEOI1     Artificial Intelligence                           V     2
MEOI5NEW  Software Engineering                              V     2
MEOI5     Software Engineering                              V     2
MEOI4     Computer Graphics and Interaction                 V     2
MEOI3     Computer Vision and Image Processing              V     2
MEOI2     Computer Engineering                              V     2
MEEEM1    Technological Systems                             V     2
MEEEM5    Economy and Management of Electrical Engineering  V     2
MEEEM4    Economy and Management of Power Engineering       V     2
MEEEM3    Electrical Power Engineering                      V     2
MEEEM2    Electrical Machines, Apparatus and Drives         V     2
MEKYR4    Aerospace Systems                                 V     2
MEKYR1    Robotics                                          V     2
MEKYR3    Systems and Control                               V     2
MEKYR2    Sensors and Instrumentation                       V     2

Page updated 26.6.2019 17:52:50, semester: Z,L/2020-1, L/2018-9, Z,L/2019-20. Proposal and Realization: I. Halaška (K336), J. Novák (K336)