The AI Concepts Podcast is my attempt to turn the complex world of artificial intelligence into bite-sized, easy-to-digest episodes. Imagine a space where you can pick any AI topic and immediately grasp it, like flipping through an audio lexicon - but even better! Using vivid analogies and storytelling, I guide you through intricate ideas, helping you create mental images that stick. Whether you’re a tech enthusiast, business leader, technologist, or just curious, my episodes bridge the gap between cutting-edge AI and everyday understanding. Dive in and let your imagination bring these concepts to life!
This episode addresses how Reinforcement Learning from Human Feedback (RLHF) adds the final layer of alignment after supervised fine-tuning, shifting the training signal from “right vs wrong” to “better vs worse.” We explore how preference rankings create a reward signal (reward models plus PPO) and the newer shortcut (DPO) that learns preferences directly, then connect RLHF to safety through the Helpful, Honest, Harmless goal. We ...
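For listeners who like to see the math, here is a minimal sketch, not taken from the episode, of the two preference losses discussed above, assuming PyTorch and precomputed per-response log-probabilities; the tensors and beta value are illustrative placeholders.

```python
# A minimal sketch of pairwise preference training, assuming PyTorch and
# per-sequence log-probabilities that a real pipeline would compute from
# the policy and a frozen reference model. Values below are made up.
import torch
import torch.nn.functional as F

def reward_model_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry ranking loss for a reward model: push the chosen
    response's scalar reward above the rejected one's."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta: float = 0.1) -> torch.Tensor:
    """DPO skips the explicit reward model: the implicit reward is the
    log-prob ratio between the policy and the frozen reference."""
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    return -F.logsigmoid(beta * (chosen_margin - rejected_margin)).mean()

# Toy usage with made-up log-probabilities for one preference pair.
print(dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
               torch.tensor([-13.0]), torch.tensor([-14.0])))
```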
This episode addresses how we turn a raw base model into something that behaves like a real assistant using Supervised Fine-Tuning (SFT). We explore instruction and response training data, why SFT makes behaviors consistent beyond prompting, and the practical engineering choices that keep fine-tuning efficient and safe, including low learning rates and LoRA-style adapters. By the end, you will understand what SFT solves, and why th...
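To make the adapter idea concrete, here is a minimal LoRA-style sketch, assuming PyTorch; the class name, rank, and scaling are illustrative choices, not the exact recipe from the episode or any particular library.

```python
# A minimal LoRA-style adapter sketch: freeze the pretrained weight and learn
# only a small low-rank correction on top of it.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen pretrained linear layer and adds a trainable
    low-rank update W + (alpha/r) * B @ A, so only ~r*(d_in+d_out)
    parameters are touched during fine-tuning."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # keep the pretrained weights frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))             # same shape as the base layer's output
```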
This episode addresses the physical and mathematical limits of a model’s "short-term memory." We explore the context window and the engineering trade-offs required to process long documents. You will learn about the quadratic cost of attention, where doubling the input length quadruples the computational work, and why this creates a massive bottleneck for long-form reasoning. We also introduce architectural tricks like Flash Attention...
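The quadratic growth is easy to see with a quick back-of-the-envelope calculation; the numbers below are illustrative only.

```python
# Why attention cost is quadratic: the score matrix compares every token with
# every other token, so doubling the sequence length roughly quadruples the work.
def attention_score_entries(seq_len: int) -> int:
    return seq_len * seq_len      # one similarity score per (query, key) pair

for n in (1_000, 2_000, 4_000, 8_000):
    print(f"{n:>5} tokens -> {attention_score_entries(n):>12,} score entries")
# 2,000 tokens is 4x the work of 1,000; 8,000 tokens is 64x.
```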
This episode explores the foundational stage of creating an LLM known as the pre-training phase. We break down the "Trillion Token Diet" by explaining how models move from random weights to sophisticated world models through the simple objective of next-token prediction. You will learn about the Chinchilla Scaling Laws, the mathematical relationship between model size and data volume. We also discuss why the industry shifted from b...
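As a rough illustration of the scaling idea, here is a tiny calculation using the commonly cited approximation of about 20 training tokens per parameter; treat the constant as a rule of thumb, not an exact law from the episode.

```python
# A rough sketch of the Chinchilla rule of thumb: for compute-optimal
# pre-training, token count should scale with model size, roughly ~20 training
# tokens per parameter (an approximation, not an exact law).
def chinchilla_optimal_tokens(n_params: float, tokens_per_param: float = 20.0) -> float:
    return n_params * tokens_per_param

for params in (1e9, 7e9, 70e9):
    tokens = chinchilla_optimal_tokens(params)
    print(f"{params/1e9:>4.0f}B params -> ~{tokens/1e12:.2f}T training tokens")
```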
Shay explains where a transformer actually stores knowledge: not in attention, but in the MLP (feed-forward) layer. The episode frames the transformer block as a two-step loop: attention moves information between tokens, then the MLP transforms each token’s representation independently to inject learned knowledge.
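Here is a minimal sketch of that two-step loop, assuming PyTorch; the layer sizes and class name are illustrative, not the episode's code.

```python
# The two-step loop in miniature: attention mixes information across tokens,
# then an MLP transforms each token independently.
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(                 # the "knowledge" layer:
            nn.Linear(d_model, 4 * d_model),      # expand,
            nn.GELU(),                            # non-linearity,
            nn.Linear(4 * d_model, d_model),      # project back, per token
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                         # x: (batch, seq, d_model)
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)
        x = x + attn_out                          # step 1: move info between tokens
        x = x + self.mlp(self.norm2(x))           # step 2: per-token transformation
        return x

block = TransformerBlock()
print(block(torch.randn(1, 10, 64)).shape)        # torch.Size([1, 10, 64])
```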
Shay breaks down the encoder vs decoder split in transformers: encoders (BERT) read the full text with bidirectional attention to understand meaning, while decoders (GPT) generate text one token at a time using causal attention.
The episode ties the architecture to training (masked-word prediction vs next-token prediction), explains why decoder-only models dominate today (they can both interpret prompts and generate efficiently with KV caching)...
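The split is easy to visualize with attention masks; the tiny numpy sketch below is illustrative and not from the episode.

```python
# An encoder lets every token attend to every other token, while a decoder's
# causal mask hides future tokens so generation can proceed one token at a time.
import numpy as np

seq_len = 5
bidirectional_mask = np.ones((seq_len, seq_len), dtype=int)      # BERT-style: see everything
causal_mask = np.tril(np.ones((seq_len, seq_len), dtype=int))    # GPT-style: no peeking ahead

print(causal_mask)
# [[1 0 0 0 0]
#  [1 1 0 0 0]
#  [1 1 1 0 0]
#  [1 1 1 1 0]
#  [1 1 1 1 1]]   row i may only attend to positions <= i
```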
Shay explains multi-head attention and positional encodings: how transformers run multiple parallel attention 'heads' that specialize, why we concatenate their outputs, and how positional encodings reintroduce word order into parallel processing.
The episode uses clear analogies (lawyer, engineer, accountant), highlights GPU efficiency, and previews the next episode on encoder vs decoder architectures.
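For the curious, here is a short sketch of the classic sinusoidal positional encodings from the original transformer paper, assuming numpy; the sequence length and dimensions are arbitrary.

```python
# Each position gets a unique pattern of sines and cosines that is added to its
# embedding, reintroducing word order into an otherwise parallel computation.
import numpy as np

def positional_encoding(seq_len: int, d_model: int) -> np.ndarray:
    positions = np.arange(seq_len)[:, None]                     # (seq_len, 1)
    dims = np.arange(0, d_model, 2)[None, :]                    # even dimensions
    angles = positions / (10000 ** (dims / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                                # even dims: sine
    pe[:, 1::2] = np.cos(angles)                                # odd dims: cosine
    return pe

print(positional_encoding(seq_len=8, d_model=16).shape)         # (8, 16)
```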
In this episode, Shay walks through the transformer's attention mechanism in plain terms: how token embeddings are projected into queries, keys, and values; how dot products measure similarity; why scaling and softmax produce stable weights; and how weighted sums create context-enriched token vectors.
The episode previews multi-head attention (multiple perspectives in parallel) and ends with a short encouragement to take a small step...
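Here is a minimal numpy sketch of that pipeline, with random placeholder weights standing in for learned projections; it is illustrative, not the episode's code.

```python
# Project tokens into queries, keys, and values; score with dot products;
# scale and softmax; take the weighted sum to get context-enriched vectors.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                            # query-key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax, row by row
    return weights @ V                                         # context-enriched tokens

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))                               # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))       # random stand-in projections
out = scaled_dot_product_attention(tokens @ Wq, tokens @ Wk, tokens @ Wv)
print(out.shape)                                               # (4, 8)
```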
Shay breaks down the 2017 paper "Attention Is All You Need" and introduces the transformer: a non-recurrent architecture that uses self-attention to process entire sequences in parallel.
The episode explains positional encoding, how self-attention creates context-aware token representations, the three key advantages over RNNs (parallelization, global receptive field, and precise signal mixing), the quadratic computational trade-off...
Shay breaks down why recurrent neural networks (RNNs) struggled with long-range dependencies in language: fixed-size hidden states and the vanishing gradient caused models to forget early context in long texts.
The episode explains how LSTMs added gates (forget, input, output) to manage memory and improve short-term performance but remained serial, creating a training and scaling bottleneck that prevented using massive parallel compute.
The...
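A tiny numeric sketch makes the vanishing-gradient intuition concrete; the per-step factor below is made up purely for illustration.

```python
# Backpropagating through many recurrent steps multiplies the signal by a
# factor per step; factors below 1 shrink it toward zero very quickly.
recurrent_factor = 0.9                      # illustrative per-step gradient scaling
for steps in (10, 50, 100):
    print(f"after {steps:>3} steps the gradient is scaled by {recurrent_factor ** steps:.6f}")
# after  10 steps the gradient is scaled by 0.348678
# after  50 steps the gradient is scaled by 0.005154
# after 100 steps the gradient is scaled by 0.000027  -> early tokens barely matter
```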
This episode dives into the hidden layer where language stops being words and becomes numbers. We explore what tokens actually are, how tokenization breaks text into meaningful fragments, and why this design choice quietly shapes a model’s strengths, limits, and quirks. Once you understand tokens, you start seeing why language models sometimes feel brilliant and sometimes strangely blind.
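To make tokenization tangible, here is a toy greedy longest-match tokenizer; the vocabulary is invented and does not reflect any real model's tokenizer.

```python
# Text is split into the longest known fragments, falling back to single
# characters, which is why rare words shatter into many small tokens.
VOCAB = ["un", "break", "able", "ing", "token", "ize", "r", "s", "the", " "]

def tokenize(text: str) -> list[str]:
    tokens, i = [], 0
    while i < len(text):
        match = next((v for v in sorted(VOCAB, key=len, reverse=True)
                      if text.startswith(v, i)), text[i])   # fall back to a single char
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("unbreakable tokenizers"))
# ['un', 'break', 'able', ' ', 'token', 'ize', 'r', 's']
```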
This episode explores the hidden engine behind how language models move from knowing to creating. It reveals why generation happens step by step, why speed has hard limits, and why training and usage behave so differently. Once you see this mechanism, the way models write, reason, and sometimes stall will make immediate sense.
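The step-by-step constraint is easy to see in a toy decoding loop; the "model" below is a made-up lookup table, purely for illustration.

```python
# Each new token requires a pass over everything generated so far, so output
# speed is bounded by the number of sequential steps.
import random

FAKE_MODEL = {                 # maps the previous token to next-token choices
    "<start>": ["the"], "the": ["model", "token"],
    "model": ["writes", "stops"], "token": ["appears"],
    "writes": ["tokens"], "tokens": ["<end>"], "appears": ["<end>"], "stops": ["<end>"],
}

def generate(max_steps: int = 10) -> list[str]:
    output, current = [], "<start>"
    for _ in range(max_steps):                 # one token per step, never in parallel
        current = random.choice(FAKE_MODEL[current])
        if current == "<end>":
            break
        output.append(current)
    return output

print(generate())    # a short token-by-token continuation, e.g. ['the', 'model', ...]
```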
This episode is about the hidden space where generative models organize meaning. We move from raw data into a compressed representation that captures concepts rather than pixels or tokens, and we explore how models learn to navigate that space to create realistic outputs. Understanding this idea explains both the power of generative AI and why it sometimes fails in surprising ways.
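As a rough illustration, the sketch below encodes two inputs with a random projection standing in for a trained encoder and walks a straight line between them in latent space; everything here is a placeholder.

```python
# Once inputs live in a compressed latent space, you can move between concepts
# by interpolating latent vectors.
import numpy as np

rng = np.random.default_rng(42)
encoder = rng.normal(size=(784, 16))          # pretend: 784-pixel image -> 16-dim latent

image_a, image_b = rng.normal(size=(2, 784))
z_a, z_b = image_a @ encoder, image_b @ encoder

for t in (0.0, 0.5, 1.0):
    z = (1 - t) * z_a + t * z_b               # walk a straight line in latent space
    print(f"t={t:.1f}  first latent dims: {np.round(z[:3], 2)}")
```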
Welcome to Episode One of The Generative Shift. This episode introduces the core change behind modern AI, the move from discriminative models that draw decision boundaries to generative models that learn the full structure of data. Instead of predicting labels using conditional probability, generative systems model the joint distribution itself, which allows them to create rather than classify. This shift reshapes the math, the architecture...
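The joint-versus-conditional distinction fits in a few lines of numpy; the probability table below is invented for illustration.

```python
# A generative model learns the joint distribution P(x, y) and can sample new
# data; a discriminative model only needs the conditional P(y | x).
import numpy as np

# Joint distribution over x (rows: "short email", "long email")
# and y (cols: "spam", "not spam").
P_xy = np.array([[0.10, 0.30],
                 [0.25, 0.35]])
assert np.isclose(P_xy.sum(), 1.0)

# Discriminative view: condition on x and predict y.
P_y_given_x = P_xy / P_xy.sum(axis=1, keepdims=True)
print(P_y_given_x)          # each row sums to 1

# Generative view: sample a brand-new (x, y) pair from the joint itself.
rng = np.random.default_rng(0)
flat_index = rng.choice(P_xy.size, p=P_xy.ravel())
print(np.unravel_index(flat_index, P_xy.shape))   # a freshly generated (x, y) pair
```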
Hello everyone, and welcome to The Generative AI Series. I’m Shay, and this introductory episode is about why this series exists and who it is for. Generative AI has exploded, but real understanding is still scattered. Between hype, shortcuts, and surface-level strategy talk, it is hard to find a clear path from fundamentals to building systems that actually work. This series is for practitioners, builders, architects, and technica...
Welcome to the final episode of our Deep Learning series on the AI Concepts Podcast. In this episode, host Shay takes you on a journey through the world of autoencoders, a foundational class of AI models. Unlike traditional models that predict or label, autoencoders excel at understanding and reconstructing data by learning to compress information. Discover how this quiet revolution in AI powers features like image enhancement and noise cancellation...
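Here is a minimal autoencoder sketch, assuming PyTorch; the dimensions are illustrative and the model is untrained, shown only to make the compress-and-reconstruct loop concrete.

```python
# The encoder squeezes the input through a narrow bottleneck and the decoder
# reconstructs it, so training needs only the reconstruction error.
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim: int = 784, bottleneck: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 128), nn.ReLU(),
                                     nn.Linear(128, bottleneck))       # compressed code
        self.decoder = nn.Sequential(nn.Linear(bottleneck, 128), nn.ReLU(),
                                     nn.Linear(128, input_dim))        # reconstruction

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
x = torch.rand(16, 784)                       # e.g. a batch of flattened images
loss = nn.functional.mse_loss(model(x), x)    # learn by reconstructing the input
loss.backward()
```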
Welcome to the AI Concepts Podcast, where we explore AI, one concept at a time. In this episode, host Shay delves into the transformative world of transformers in AI, focusing on how they have revolutionized language understanding and generation. Discover how transformers enable models like ChatGPT to respond thoughtfully and coherently, transforming inputs into conversational outputs with unprecedented accuracy. The discussion unv...
In this episode of the AI Concepts Podcast, host Shay delves into the evolution of deep learning architectures, highlighting the limitations of RNN, LSTM, and GRU models when handling sequence processing and long-range dependencies. The breakthrough discussed is the attention mechanism, which allows models to dynamically focus on relevant parts of the input, improving efficiency and contextual awareness.
Shay unpacks the process ...
Welcome to another episode of the AI Concepts Podcast, where we simplify complex AI topics into digestible explanations. This episode continues our Deep Learning series, diving into the limitations of Recurrent Neural Networks (RNNs) and introducing their game-changing successors: Long Short-Term Memory (LSTM) networks and Gated Recurrent Units (GRUs). Learn how these architectures revolutionize tasks with long-term dependencies by...
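To expose the gate math, here is a hand-written LSTM cell sketch, assuming PyTorch; a real project would reach for nn.LSTM, and the sizes here are arbitrary.

```python
# The forget, input, and output gates decide what to erase, what to write, and
# what to reveal from the cell's memory at each timestep.
import torch
import torch.nn as nn

class TinyLSTMCell(nn.Module):
    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        self.gates = nn.Linear(input_dim + hidden_dim, 4 * hidden_dim)
        self.hidden_dim = hidden_dim

    def forward(self, x, h, c):
        z = self.gates(torch.cat([x, h], dim=-1))
        f, i, o, g = z.chunk(4, dim=-1)
        f, i, o = torch.sigmoid(f), torch.sigmoid(i), torch.sigmoid(o)  # forget, input, output gates
        c = f * c + i * torch.tanh(g)      # forget old memory, write new memory
        h = o * torch.tanh(c)              # expose part of the memory as output
        return h, c

cell = TinyLSTMCell(input_dim=8, hidden_dim=16)
h = c = torch.zeros(1, 16)
for x_t in torch.randn(5, 1, 8):           # still strictly one timestep at a time
    h, c = cell(x_t, h, c)
```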
Welcome to the AI Concepts Podcast! In this episode, we dive into the fascinating world of Recurrent Neural Networks (RNNs) and how they revolutionize the processing of sequential data. Unlike models you've heard about in previous episodes, RNNs provide the capability to remember context over time, making them essential for tasks involving language, music, and time series predictions. Using analogies and examples, we delve into the...
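Here is a small numpy sketch of the recurrence that gives an RNN its memory; the weights and inputs are random placeholders.

```python
# The same weights are applied at every step, and the hidden state carries
# context forward from one timestep to the next.
import numpy as np

rng = np.random.default_rng(1)
W_x = rng.normal(scale=0.5, size=(4, 3))     # input -> hidden
W_h = rng.normal(scale=0.5, size=(4, 4))     # hidden -> hidden (the memory path)

hidden = np.zeros(4)
sequence = rng.normal(size=(6, 3))           # six timesteps of 3-dim inputs
for x_t in sequence:
    hidden = np.tanh(W_x @ x_t + W_h @ hidden)   # new state mixes input with memory
print(np.round(hidden, 3))
```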