AI+ Audio™

Price
Net: 397,00
VAT: 75,43

Duration
1 day

For companies and job seekers: this course is eligible for 100% funding!

Location

Course Language
English

Training Solutions
WalkIn®

A new era of digital soundscapes is emerging, characterized by AI-supported production and smart audio workflows. Modern tools transform ideas into precise sound solutions and create space for automated processes, creative work, and high-quality results.

Key topics

  • AI-based audio processing and voice modeling.
  • Automated sound generation and mixing.
  • Workflow optimization through intelligent tools.
  • Trends in synthetic voices, audio agents, and adaptive sound design.

Prerequisites
Basic understanding of digital media and interest in AI-supported audio.

Target audience
Professionals from the media, content, technology, and creative industries, as well as individuals who want to make targeted use of AI sound technologies.

A future-oriented way to design modern audio ecosystems, set new production standards, and use AI for precise, scalable sound solutions.
 

Course content
  • What is AI?
  • AI in everyday life: Audio examples
  • Basics of sound waves, amplitude, and frequency
  • Basics of digital audio technology
  • AI for audio enhancement and restoration
  • AI for audio accessibility and personalization
  • AI in speech and voice technologies
  • Popular audio libraries: Librosa, PyAudio
  • Use case: AI-powered real-time captioning and translation for live events
  • Case study: Personalized hearing aid fitting using AI and smart earbuds
  • Practical application: Recognizing emotions in the voice using Deepgram's speech AI platform
  • Machine learning models for audio applications
  • Deep learning and advanced AI techniques for audio
  • Audio-specific architectures: CNNs, RNNs, transformers
  • Transfer learning in audio AI
  • Use case: Speech-to-text transcription for medical records
  • Case study: AI-assisted music generation with deep learning
  • Practical exercise: Creating a speech-to-text model with TensorFlow
  • Fundamentals of speech recognition and phonetics
  • API-based ASR solutions
  • Creating custom ASR models with transformers
  • Introduction to TTS and voice cloning
  • Use case: Automating meeting minutes with the Google Speech-to-Text API
  • Case study: Custom Transformer-based ASR model for multilingual customer support
  • Practical exercise: Transcribing audio with an ASR API; generating speech from text
  • Common audio problems
  • AI-based noise filtering and enhancement
  • Use case: Improving audio quality for remote work calls using AI noise reduction
  • Case study: Krisp's AI-powered noise reduction in podcast production
  • Practical application: Cleaning up noisy audio with Krisp or Adobe Enhance Speech
  • Introduction to emotion recognition
  • AI models for emotion recognition: RNNs, LSTMs, CNNs
  • Challenges: Bias, multilingual contexts, reliability
  • Use case: Improving customer service through emotion recognition from speech
  • Case study: IBM Watson Tone Analyzer for real-time emotion recognition
  • Practical exercise: Analyzing speech samples with IBM Watson Tone Analyzer or similar APIs
  • Risks posed by deepfakes and voice cloning
  • Data protection and data security
  • Bias and fairness in audio AI
  • Use case: Implementing ethical voice data collection and consent management
  • Case study: Dealing with bias and data protection in audio AI in compliance with the GDPR
  • Practical exercise: Detecting fake audio clips; creating a checklist for ethical AI
  • Noise detection and classification
  • Audio search and indexing
  • Innovations: multimodal AI, edge computing, 3D audio
  • New professions in the field of audio AI

Frequently asked questions

  • The course conveys practical knowledge of voice AI, speech synthesis, AI-supported audio processing, and sound production in a compact, understandable, and future-oriented form.
  • Ideal for media professionals, content creators, marketing teams, or creatives who want to use AI for audio production—without a recording studio or specialized knowledge.
  • It guides you step by step through tools, basics, and workflows. Complex terms are explained in an understandable way. No prior knowledge is necessary—just get started.
  • Focus on creative practice rather than technical theory. Combines expertise in AI, audio, and communication with tools that can be used immediately.
  • Modern AI tools such as TensorFlow Audio, OpenAI Jukebox, Google Magenta, AIVA, Wav2Vec, SpeechBrain, and AudioLDM are used. These are supplemented by plugins and software such as Adobe Podcast AI, Audacity AI, FL Studio, Logic Pro, and Spotify Audio Analysis.
  • Automated speech production saves time, reduces production costs, and enables new formats, such as multilingual content at the touch of a button.
  • AI audio optimizes presentations, campaigns, training sessions, and podcasts. Wherever people speak, smart technology increases the impact.
  • Artificial intelligence will continue to change voice, music, and sound. Those who understand the basics will remain flexible, innovative, and technologically up to date.

Do you have any further questions? Please contact us.