Torrent details for "Virtanen N. Techniques for Noise Robustness in Automatic Speech Recognition 2013 [andryold1]"

Language: English
Total Size: 7.75 MB
Info Hash: 4a3afb2474b99c14d382010efcb4781c8abe350d
Added: 06-12-2022 14:57




Description
Textbook in PDF format

The term computer speech recognition conjures up visions of the science-fiction capabilities of HAL 9000 in 2001: A Space Odyssey, or Data, the anthropoid robot in Star Trek, who can communicate through speech with as much ease as a human being. However, our real-life encounters with automatic speech recognition are usually rather less impressive, comprising often-annoying exchanges with interactive voice response, dictation, and transcription systems that make many mistakes, frequently misrecognizing what is spoken in a way that humans rarely would. The reasons for these mistakes are many. Some have to do with fundamental limitations of the mathematical framework employed and with inadequate awareness or representation of context, world knowledge, and language. But other, equally important sources of error are distortions introduced into the audio during recording, transmission, and storage.
As automatic speech recognition (ASR) systems find increasing use in everyday life, the speech they must recognize is being recorded under a wider variety of conditions than ever before. It may be recorded over a variety of channels, including landline and cellular phones, the internet, etc., using different kinds of microphones, which may be placed close to the mouth, as with head-mounted microphones or telephone handsets, or at a distance from the speaker, as with desktop microphones. It may be corrupted by a wide variety of noises, such as sounds from various devices in the vicinity of the speaker, general background sounds such as those in a moving car or background babble in crowded places, or even competing speakers. It may also be affected by reverberation, caused by sound reflections in the recording environment. And, of course, all of the above may occur concurrently in myriad combinations and, just to make matters more interesting, may change unpredictably over time.
For speech-recognition systems to perform acceptably, they must be robust to these distorting influences. This book deals with techniques that impart such robustness to ASR systems. We present a collection of articles from experts in the field, which describe an array of strategies that operate at various stages of processing in an ASR system. They range from techniques for minimizing the effect of external noises at the point of signal capture, to methods of deriving features from the signal that are fundamentally robust to signal degradation, techniques for attenuating the effect of external noises on the signal, and methods for modifying the recognition system itself to recognize degraded speech better.
The selection of techniques described in this book is intended to cover the range of approaches that are currently considered state of the art. Many of these approaches continue to evolve; nevertheless, we believe that a practitioner in the field must be familiar with the fundamental principles involved in order to follow these developments. The articles in this book are designed and edited to present these fundamental principles adequately. They are intended to be easy to understand, and sufficiently tutorial for the reader to be able to implement the described techniques.
Foundations
  The Basics of Automatic Speech Recognition
  The Problem of Robustness in Automatic Speech Recognition
Signal Enhancement
  Voice Activity Detection, Noise Estimation, and Adaptive Filters for Acoustic Signal Enhancement
  Extraction of Speech from Mixture Signals
  Microphone Arrays
Feature Enhancement
  From Signals to Speech Features by Digital Signal Processing
  Features Based on Auditory Physiology and Perception
  Feature Compensation
  Reverberant Speech Recognition
Model Enhancement
  Adaptation and Discriminative Training of Acoustic Models
  Factorial Models for Noise Robust Speech Recognition
  Acoustic Model Training for Robust Speech Recognition
Compensation for Information Loss
  Missing-Data Techniques: Recognition with Incomplete Spectrograms
  Missing-Data Techniques: Feature Reconstruction
  Computational Auditory Scene Analysis and Automatic Speech Recognition
  Uncertainty Decoding
