Audio Segmentation for Meetings Speech Processing: (English) (Paperback) | Released: 01 Sep 2011

Name: Audio Segmentation for Meetings Speech Processing: (English)
Brand: Bookswagon
Price: 5948.34 INR
Availability: OutOfStock

By: Kofi Agyeman Boakye (Author) | Publisher: Proquest, Umi Dissertation Publishing | Publisher Imprint: Proquest, Umi Dissertation Publishing

ISBN-10: 1243991615
ISBN-13: 9781243991614
Page Number: 168
Language: English
Imprint: Proquest, Umi Dissertation Publishing
Weight (gr): 313
Dimensions (mm): 246 x 9 x 189

About the Book

Perhaps more than any other domain, meetings represent a rich source of content for spoken language research and technology. Two common (and complementary) forms of meeting speech processing are automatic speech recognition (ASR), which seeks to determine what was said, and speaker diarization, which seeks to determine who spoke when. Because of the complexity of meetings, however, such forms of processing present a number of challenges. In speech recognition, crosstalk speech is often the primary source of errors for audio from the personal microphones worn by meeting participants. This crosstalk typically produces insertion errors in the recognizer, which mistakenly processes this non-local speech audio. In speaker diarization, overlapped speech generates a significant number of errors for most state-of-the-art systems, which are generally unequipped to deal with this phenomenon. These errors appear as missed speech, where overlap segments are not identified, and as increased speaker error from speaker models negatively affected by the overlapped speech data.

This thesis sought to address these issues by employing audio segmentation as a first step to both automatic speech recognition and speaker diarization in meetings. For ASR, the objective was to segment nonspeech and local speech; for speaker diarization, the audio classes to be segmented were nonspeech, single-speaker speech, and overlapped speech. A major focus was the identification of features suited to segmenting these audio classes: for crosstalk, cross-channel features were explored, while for monaural overlapped speech, energy, harmonic, and spectral features were examined. Using feature subset selection, the best combination of auxiliary features to baseline MFCCs in the former scenario consisted of normalized maximum cross-channel correlation and log-energy difference; for the latter scenario, RMS energy, harmonic energy ratio, and modulation spectrogram features were determined to be the most useful in the realistic multi-site farfield audio condition.

For ASR, word error rate improved over the baseline by 13.4% relative on development data and 9.2% relative on validation data. For speaker diarization, results proved less consistent, with relative diarization error rate (DER) improvements of 23.25% on development data but no significant change on a randomly selected validation set. Closer inspection revealed performance variability at the meeting level, with some meetings improving substantially and others degrading. Further analysis over a large set of meetings confirmed this variability, but also showed many meetings benefiting significantly from the proposed technique.
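
For readers unfamiliar with the two cross-channel features named in the description, the following is a minimal Python sketch of how such features might be computed for one frame of audio from a target microphone and one other microphone. It is not taken from the thesis: the function names, the lag range, the small constant `eps`, and in particular the choice to normalize the correlation peak by the geometric mean of the two frame energies are all assumptions made for illustration, and the author's exact definitions may differ.

```python
import math


def frame_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)


def log_energy_difference(target, other, eps=1e-10):
    """Log-energy of the target frame minus log-energy of the other frame.

    eps (an assumed small constant) avoids log(0) on silent frames.
    A large positive value suggests the speech is local to the target mic.
    """
    return math.log(frame_energy(target) + eps) - math.log(frame_energy(other) + eps)


def normalized_max_cross_correlation(target, other, max_lag, eps=1e-10):
    """Peak cross-correlation over lags in [-max_lag, max_lag].

    Normalization by the geometric mean of the two frame energies is an
    assumption of this sketch; it keeps the value at or below 1.
    """
    best = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i, x in enumerate(target):
            j = i + lag
            if 0 <= j < len(other):
                acc += x * other[j]
        best = max(best, acc)
    norm = math.sqrt(frame_energy(target) * frame_energy(other)) + eps
    return best / norm


# Hypothetical toy frames: `other` is the target delayed by one sample and
# attenuated by half, roughly what crosstalk on a neighbor's mic looks like.
target_frame = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
other_frame = [0.0, 0.0, 0.5, 0.0, -0.5, 0.0, 0.5, 0.0]

print(log_energy_difference(target_frame, other_frame))
print(normalized_max_cross_correlation(target_frame, other_frame, max_lag=3))
```

The nested loop is quadratic in frame length and lag range, which is fine for illustration; a practical system would more likely use a vectorized or FFT-based correlation. In a segmenter, values like these would be appended to the MFCC vector for each frame, once per non-target channel or summarized across channels.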

Product Details

ISBN-13: 9781243991614
Publisher: Proquest, Umi Dissertation Publishing
Publisher Imprint: Proquest, Umi Dissertation Publishing
Height: 246 mm
No of Pages: 168
Series Title: English
Weight: 313 gr

ISBN-10: 1243991615
Publisher Date: 01 Sep 2011
Binding: Paperback
Language: English
Returnable: N
Spine Width: 9 mm
Width: 189 mm

Related Categories

Technology, Engineering, Agriculture, Industrial processes > Technology: general issues
