Textbook in PDF format
This book collects the contributions to the NATO Advanced Study Institute on New Advances and Trends in Speech Recognition and Coding, held in Bubión, Granada (Spain), from June 28th to July 10th, 1993.
The goal of the ASI was to bring together the most important experts on speech recognition and coding to discuss and disseminate their most recent findings, so that these could spread throughout European and American institutions through a well-chosen group of students. The main topics considered in the ASI were acoustic modeling; language modeling; speech processing, analysis and synthesis; speech coding; and vector quantization and neural nets. For each of these topics, some of the best-known researchers were invited to give a lecture. In addition to these lectures, the topics were complemented with discussions and presentations of the work of those attending.
The book has been divided into five parts corresponding to the above topics, the main focuses of the Advanced Study Institute. Each part includes the lectures and a brief account of the additional contributions. The texts of the contributions were assembled after the meeting and are believed to be up to date as of publication.
The first part, acoustic modeling, contains four lectures and twenty contributed papers. Tutorials are presented by J.P. Haton (on problems of recognition of noisy speech), C.H. Lee (adaptive learning), G. Chollet (evaluation of systems, algorithms and databases), and B.H. Juang (statistical and discriminative approaches). Contributed papers presented results on automatic labeling, search methods, duration modeling, noisy speech, and other aspects of acoustic modeling.
The second part covers language modeling and contains six invited papers presented by R. Pieraccini (on a learning approach to natural language understanding), R. de Mori (language models), E. Vidal (grammatical inference), H. Niemann (statistical modeling), H. Ney (search strategies), and F. Jelinek (new approaches to language modeling). Ten contributed papers were presented on grammar constraints, word spotting, and other aspects of the topic.
Speech processing, analysis and synthesis is the general topic of the third part. It contains the invited lecture of L.R. Rabiner, on applications of speech processing in telecommunications. Twelve contributed papers are included on signal segmentation, text-to-speech systems, prosody, and other related subjects.
The fourth part, speech coding, contains four invited papers and eight contributed papers. Lectures were given by I. Trancoso (on CELP coding), A. Gersho (a general overview), N. Farvardin (noisy channels) and J.P. Adoul (lattice and trellis coded quantizations). Contributed papers were presented on subband coding, line spectral frequencies, the application of discrete cosine transform, and other topics related to speech coding.
The last part of the book covers vector quantization and neural networks. It contains the invited lecture of A. Waibel on speech translation using neural networks, and seven contributed papers on the use of recurrent neural networks, LVQ, genetic algorithms, and other related techniques.
Part I: Acoustic Modeling
Automatic Recognition of Noisy Speech
Adaptive Learning in Acoustic and Language Modeling
Evaluation of ASR Systems, Algorithms and Databases
Statistical and Discriminative Methods for Speech Recognition
Automatic Speech Labeling Using Word Pronunciation Networks and Hidden Markov Models
Heuristic Search Methods for a Segment Based Continuous Speech Recognizer
Dimension and Structure of the Vowel Space
Continuous Speech HMM Training System: Applications to Speech Recognition and Phonetic Label Alignment
HMM Based Acoustic-Phonetic Decoding with Constrained Transitions and Speaker Topology
Experiments on a Fast Mixture Density Likelihood Computation
Explicit Modelling of Duration in HMM: an Efficient Algorithm
Acoustic-Phonetic Decoding of Spanish Continuous Speech with Hidden Markov Models
HMM-Based Speech Recognition in Noisy Car Environment
Extensions to the AESA for Finding k-Nearest-Neighbours
An Efficient Pruning Algorithm for Continuous Speech Recognition
On the Performance of SCHMM for Isolated Word Recognition and Rejection
A Speaker Independent Isolated Word Recognition System for Turkish
The Speech Recognition Research System of the TU Dresden
A MMI Codebook Design for MVQHMM Speech Recognition
SLHMM: An ANN Approach for Continuous Speech Recognition
Medium Vocabulary Audiovisual Speech Recognition
SLAM: A PC-Based Multi-Level Segmentation Tool
Durational Modelling in HMM-based Speech Recognition: Towards a Justified Measure
Rejection in Speech Recognition for Telecommunication Applications
Part II: Language Modeling
A Learning Approach to Natural Language Understanding
Language Models for Automatic Speech Recognition
Grammatical Inference and Automatic Speech Recognition
Statistical Modeling of Segmental and Suprasegmental Information
Search Strategies For Large-Vocabulary Continuous-Speech Recognition
Two New Approaches to Language Modeling: A Tutorial
Representing Word Pronunciations as Trees
Language Models Comparison in a Robot Telecontrol Application
Keyword Propagation Viterbi Algorithm
Dialog and Language Modeling in CRIM's ATIS System
On the Use of the Leaving-One-Out Method in Statistical Language Modelling
Application of Grammar Constraints to ASR Using Signature Functions
CRIM Hidden Markov Model Based Keyword Recognition System
Modelling Phone-Context in Spanish by Using SCMGGI Models
Efficient Integration of Context-Free Language Models in Continuous Speech Recognition
Keyword Spotting, an Application for Voice Dialing
Part III: Speech Processing, Analysis and Synthesis
Telecommunications Applications of Speech Processing
Disambiguating Hierarchical Segmentations of Speech Signals
Talker Tracking using two Microphone Pairs and a Crosspower-Spectrum Phase Analysis
A Text-to-Speech Services Architecture for UNIX
Comparison of Parametric Spectral Representations for Voice Recognition in Noisy Environments
Spectral Analysis of Turkish Vowels and a Comparison of Vowel Normalization Algorithms
Can You Tell Apart Spontaneous and Read Speech if You Just Look at Prosody?
The Prosodic Marking of Phrase Boundaries: Expectations and Results
Voice Source State as a Source of Information in Speech Recognition: Detection of Laryngealizations
Voice Transformations for the Evaluation of Speaker Verification Systems
Towards a More Realistic Evaluation of Synthetic Speech: A Cognitive Perspective
A Non-Linear Speech Analysis Based on Modulation Information
The Recognition Component of the SUNDIAL Project
Part IV: Speech Coding
An Overview of Different Trends on CELP Coding
Concepts and Paradigms in Speech Coding
Speech Coding over Noisy Channels
Lattice and Trellis Coded Quantizations for Efficient Coding of Speech
8 kbit/s LD-CELP Coding for Mobile Radio
Subband Long-Term Prediction for LPC-Coders
On the Use of Interframe Information of Line Spectral Frequencies in Speech Coding
Speech Coding Using the Karhunen-Loève Representation of the Spectral Envelope of Acoustic Subwords
Excitation Construction for the Robust Low Bit Rate CELP Speech Coder
A Discrete Cosine Transform Scheme for Low-Delay Wideband Speech Coding
MOR-VQ for Speech Coding Over Noisy Analog Channels
Improved CELP Coding Using a Fully Adaptive Excitation Codebook
Part V: Vector Quantization and Neural Nets
Recent Advances in JANUS: A Speech Translation System
On a Fuzzy DVQ Algorithm for Speech Recognition
On the Use of Recurrent Neural Networks for Grammar Learning and Word Spotting
LVQ-based Codebooks in Phonemic Speech Recognition
Distributed and Local Neural Classifiers for Phoneme Recognition
A VQ Algorithm Based on Genetic Algorithms and LVQ
Vector Quantization Based Classification and Maximum Likelihood Decoding for Speaker Recognition
Evidence Combination in Speech Recognition Using Neural Networks