Course description - AE2M31RAT

AE2M31RAT Speech technology in telecommunications
Role:  Teaching load: 2P+2C
Department: 13131  Language of instruction: EN
Guarantors:  Completion: Z,ZK
Lecturers:  Credits: 6
Exercise instructors:  Semester: L

Annotation:

The course covers the basics of speech processing and is intended for master's students. It focuses on communication applications, since speech technology is now widely used in communication systems. Further information can be found at http://noel.feld.cvut.cz/vyu/ae2m31rat . Detailed information for registered students is available on the teaching portal http://moodle.kme.feld.cvut.cz .

Study goals:

The goal of the course is to introduce the speech technology used in the most important communication applications. Students should master the basic characteristics of the speech signal, speech coding, speech enhancement, speech recognition, speech synthesis, etc. Students will practice basic speech processing tasks in the MATLAB environment and will also use other publicly available tools for speech analysis. As homework, students will prepare a semester project, which they will present during the exercises according to the planned schedule.

Lecture syllabus:

1. Introduction - speech signal, basic characteristics, speech production model
2. Digitization and basic coding strategies (PCM, ADPCM, A-law)
3. Spectral characteristics of speech signal (DFT and LPC spectrum, LSF and LSP)
4. Vocoders used in telecommunications (RPE-LTP, CELP, ACELP)
5. Methods of noise suppression for speech signals (channel and acoustic noise, VAD)
6. Echo cancellation in speech signals
7. Measurement of speech quality (subjective and objective methods)
8. Principles of speech recognition: basic tasks, feature extraction, DTW algorithm
9. Small vocabulary recognizer based on HMM (HTK toolkit)
10. Speaker recognition: verification and identification
11. Speech synthesis - basic principles (concatenative and formant synthesis, PSOLA)
12. Voice-controlled dialogue communication systems
13. Packet loss concealment for speech transmitted via a communication channel
14. Further applications of speech processing in communication systems. Reserve

Exercise syllabus:

1. Introduction: speech signal, tools for analysis, sources of speech signals
2. Basic time-domain characteristics: energy, intensity, zero-crossing, fundamental frequency
3. Spectral characteristics: short-time DFT and LPC spectrum, spectrogram
4. LPC-based vocoder: implementation of particular functional blocks
5. Suppression of additive noise in speech signal
6. Echo cancellation
7. Cepstrum and cepstral distance: voice activity detection, features for recognition
8. DTW-based recognition: simple recognizer of particular words
9. HMM-based recognition: basic tasks and demonstration of HMM modelling
10. Speaker verification based on GMM
11. Speech synthesis: implementation of formant synthesis, demonstration of available tools
12. Semester work presentations
13. Semester work presentations
14. Reserve. Credits

Literature:

[1] Rabiner, L., Schafer, R. W.: Introduction to Digital Speech Processing (Foundations and Trends in Signal Processing). Now Publishers Inc, 2007.
[2] Huang, X., Acero, A., Hon, H.-W.: Spoken Language Processing. Prentice Hall, 2001.
[3] Deller Jr., J. R., Hansen, J. H. L., Proakis, J. G.: Discrete-time Processing of Speech Signals. Wiley, 2000.
[4] McLoughlin, I.: Applied Speech and Audio Processing: With Matlab Examples. Cambridge University Press, 2009.
[5] Jelinek, F.: Statistical Methods for Speech Recognition (Language, Speech, and Communication). The MIT Press, 1998.
[6] ITU-T Recommendations - http://www.itu.int/ITU-T

Requirements:

Basic knowledge of digital signal processing is assumed.

Keywords:

speech processing, speech recognition, speech enhancement, speech coding, speech synthesis

The course is included in the following study plans:

Plan  Branch  Role  Recommended semester


Page generated on 16.4.2024 15:50:56, semesters: Z/2023-4, Z/2024-5, L/2023-4. Please send comments on the content to the study plan administrator. Design and implementation: I. Halaška (K336), J. Novák (K336)