
Building Speech AI: A Practitioner’s Guide to Speech Recognition, Synthesis, and Audio Language Models with Python


Product details

Management number: 231601648
Release Date: 2026/06/18
List Price: US$13.74
Model Number: 231601648

Speech is the next frontier of AI. Are you ready to build it?

The next generation of AI will not only read and write. It will listen, speak, interrupt, clarify, and respond in real time. For decades, the default human interface to computers has been text: searches and commands typed through a keyboard. Conversational AI, real-time voice agents, and speech-first interfaces are moving interaction from the keyboard to the microphone. Speech is more natural than text, and often several times faster for conveying the same information. The gap between an impressive demo and a reliable production system is where most teams stall.

Building Speech AI bridges that gap with the first comprehensive, practitioner-focused guide to the modern speech and audio AI stack. Written by an AI scientist with two decades of experience building production speech systems at leading labs, it connects acoustic fundamentals to modern Transformer architectures, foundation models such as Whisper, wav2vec 2.0, VITS, and AudioLM, and real-world deployment, with working Python code at every step.

What you'll learn

- How speech is represented, modeled, and generated end to end, from sound waves, frequency, amplitude, and spectrograms to neural audio representations
- The foundations of digital signal processing for speech AI, including sampling, framing, filtering, feature extraction, and acoustic modeling
- The evolution of speech recognition systems from HMM-GMM pipelines to modern end-to-end neural architectures
- The architectures behind automatic speech recognition, text-to-speech, and Transformer-based audio systems, including Whisper, wav2vec 2.0, Conformer, WaveNet, FastSpeech, VITS, and diffusion-based models
- How self-supervised learning, audio embeddings, and foundation models are reshaping voice interfaces, search, similarity, and multimodal AI systems
- How to build and evaluate real-time voice agents that can listen, respond, clarify, interrupt, and operate under production constraints
- Practical engineering trade-offs for latency, streaming, robustness, edge deployment, scaling, monitoring, and model evaluation
- Speaker recognition, speaker diarization, emotion detection, voice conversion, speech enhancement, and voice cloning
- The ethics of synthesis: deepfakes, privacy, consent, bias, safety, voice cloning, and trustworthy speech AI

Who this book is for

Machine-learning engineers, AI engineers, software developers shipping voice into products, researchers pushing the state of the art, product managers shaping voice-enabled experiences, executives setting AI strategy, and technical leaders who need to understand how the human interface is shifting.

It is for anyone who needs to understand speech AI deeply enough to build it, ship it, evaluate it, or define the voice strategy for what comes next. Whether you are transcribing audio with Whisper, designing an end-to-end voice assistant, building audio embeddings for search, or architecting scalable speech pipelines, this book provides the conceptual depth and engineering clarity to help you build systems that listen, and speak, intelligently.

Hands-on companion repository with runnable Jupyter notebooks for every chapter at prdeepakbabu.github.io/building-speech-ai.

