NLP

Speech and Audio AI

ASR, TTS, and audio understanding

4.7 · 248 student reviews
Level: Advanced
Duration: 6 months
Credits: 21
Tuition: $699 CAD
Lead instructor: Dr. Pavel Marek

About this program

A six-month program on speech and audio AI. You will cover ASR architectures (Whisper, Conformer), TTS (VITS and other modern neural TTS systems), speaker diarization, audio event detection, and multimodal audio-language models. The program includes a French/English bilingual capstone.

Student ratings

Excellent — 248 verified Canadian graduates rated this program 4.7/5. Reviews emphasize the applied capstone, instructor responsiveness, and career outcomes.

4.7 out of 5 (248 reviews)
  • 5 stars: 185
  • 4 stars: 56
  • 3 stars: 5
  • 2 stars: 1
  • 1 star: 1

Who this program is for

  • Practitioners already shipping NLP work who want more depth
  • Senior engineers, data scientists, and technical leads
  • Canadian residents seeking a verifiable diploma credential

Topics you'll cover

6 modules across 6 months — 24 lessons in total.

Six-month syllabus

Module 1 · Month 1 — Audio Foundations
  • L1 · Signal processing primer
  • L2 · Features: MFCCs, mel spectrograms
  • L3 · Augmentations
  • L4 · Lab: audio pipeline
Module 2 · Month 2 — ASR
  • L1 · CTC, transducer, attention
  • L2 · Whisper family
  • L3 · Fine-tuning
  • L4 · Lab: domain ASR
Module 3 · Month 3 — TTS
  • L1 · Vocoders
  • L2 · VITS and modern neural TTS
  • L3 · Voice cloning ethics
  • L4 · Lab: bilingual TTS
Module 4 · Month 4 — Diarization and Detection
  • L1 · Speaker diarization
  • L2 · VAD
  • L3 · Audio event detection
  • L4 · Lab: meeting transcription
Module 5 · Month 5 — Audio-Language Models
  • L1 · AudioPaLM and successors
  • L2 · Speech LLMs
  • L3 · Trends
  • L4 · Limitations
Module 6 · Month 6 — Capstone
  • L1 · Bilingual EN/FR project
  • L2 · Build, evaluate, deploy
  • L3 · Documentation
  • L4 · Final demo

What you'll be able to do

  • Train and deploy ASR systems
  • Build TTS pipelines
  • Implement speaker diarization
  • Use audio-language models
  • Handle Canadian bilingual deployments

Career paths after graduation

  • NLP Engineer
  • Conversational AI Designer
  • Language Model Specialist

Frequently asked questions

How much does the Speech and Audio AI program cost?

Tuition is $699 CAD. You can pay in full at checkout or choose an interest-free monthly plan. A 30-day refund window applies from your start date.

How long is the Speech and Audio AI program?

The program runs 6 months, cohort-based and fully online. Expect roughly 13 hours per week, including live Thursday sessions at 7 p.m. ET.

What are the prerequisites?

Python and deep learning fundamentals.

Is the diploma recognized in Canada?

Yes. Graduates receive the Altaris AI Academy Diploma in NLP, a verifiable credential with a unique certificate number. You can publish it on LinkedIn, and any employer can verify it at smart-ai-future.lovable.app/verify.

What is the refund policy?

Full refund within 30 days of your cohort start date, no questions asked. After day 30, prorated refunds are available per our Refund Policy.

Who teaches the program?

Working Canadian AI practitioners — not academics. Each cohort has a lead instructor plus a 1:1 mentor pairing for the duration of the program.
