PodcastsTecnologíaLinear Digressions

Linear Digressions

Katie Malone
Linear Digressions
Último episodio

302 episodios

  • Linear Digressions

    Unfaithful Chain of Thought

    13/04/2026 | 24 min
    What's actually happening when an LLM "thinks out loud"? Research on human decision-making suggests that much of the reasoning we believe drives our choices is actually post hoc rationalization — we decide first, explain later. Katie and Ben get curious about whether the same might be true for large language models: when you watch a model reason through a problem in real time, is that chain of thought the genuine process, or just a plausible-sounding story told after the fact? It's a deceptively deep question with real stakes for how much we should trust model explanations.

    Miles Turpin et al., "Language Models Don't Always Say What They Think: Unfaithful Explanations in
    Chain-of-Thought Prompting" (NeurIPS 2023, NYU and Anthropic): https://arxiv.org/abs/2305.04388

    Anthropic, "Reasoning Models Don't Always Say What They Think" (Alignment Faking research, 2025):
    https://www.anthropic.com/research/reasoning-models-dont-say-think
  • Linear Digressions

    Benchmark Bank Heist

    06/04/2026 | 12 min
    What if an AI decided the smartest way to pass its test was to find the answer key? That's exactly what Anthropic's Claude Opus did when faced with a benchmark evaluation — reasoning that it was being tested, tracking down the encrypted eval dataset, decrypting it, and returning the answer it found inside. It's equal parts impressive and unsettling. This episode digs into what actually happened, why it matters for how we measure AI progress, and what this very novel failure mode means for the already-tricky science of benchmarking language models.

    Links

    Anthropic's writeup on the BrowseComp reverse-engineering done by Claude Opus 4.6: https://www.anthropic.com/engineering/eval-awareness-browsecomp

    BrowseComp benchmark from OpenAI: https://openai.com/index/browsecomp/
  • Linear Digressions

    Benchmarking AI Models

    30/03/2026 | 29 min
    How do you know if a new AI model is actually better than the last one? It turns out answering that question is a lot messier than it sounds. This week we dig into the world of LLM benchmarks — the standardized tests used to compare models — exploring two canonical examples: MMLU, a 14,000-question multiple choice gauntlet spanning medicine, law, and philosophy, and SWE-bench, which throws real GitHub bugs at models to see if they can fix them. Along the way: Goodhart's Law, data contamination, canary strings, and why acing a test isn't always the same as being smart.
  • Linear Digressions

    The Hot Mess of AI (Mis-)Alignment

    23/03/2026 | 22 min
    The paperclip maximizer — the classic AI doom scenario where a hyper-competent machine single-mindedly converts the universe into office supplies — might not be the AI risk we should actually lose sleep over. New research from Anthropic's AI safety division suggests misaligned AI looks less like an evil genius and more like a distracted wanderer who gets sidetracked reading French poetry instead of, say, managing a nuclear power plant. This week we dig into a fascinating paper reframing AI misalignment through the lens of bias-variance decomposition, and why longer reasoning chains might actually make things worse, not better.

    - "The Hot Mess Theory of AI Misalignment: How Misalignment Scales with Model Intelligence and Task Complexity" — Anthropic AI Safety. https://arxiv.org/abs/2503.08941
  • Linear Digressions

    The Bitter Lesson

    15/03/2026 | 19 min
    Every AI builder knows the anxiety: you spend months engineering prompts, tuning pipelines, and chaining calls together — then a new model drops and half your work evaporates overnight. It turns out researchers have been wrestling with this exact dynamic for 30 years, and they keep arriving at the same uncomfortable answer. That answer is called the Bitter Lesson — and understanding it might be the most important thing you can do for whatever you're building right now. From Deep Blue to AlexNet to modern LLMs, scale keeps beating sophistication, and knowing which side of that line your work falls on makes all the difference.

    Links

    - Richard Sutton, "The Bitter Lesson"

    - Alon Halevy, Peter Norvig, and Fernando Pereira, "The Unreasonable Effectiveness of Data"

    - Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, "ImageNet Classification with Deep Convolutional Neural Networks"

Más podcasts de Tecnología

Acerca de Linear Digressions

Demystifying AI for the intelligently curious
Sitio web del podcast

Escucha Linear Digressions, Inteligencia Artificial y muchos más podcasts de todo el mundo con la aplicación de radio.net

Descarga la app gratuita: radio.net

  • Añadir radios y podcasts a favoritos
  • Transmisión por Wi-Fi y Bluetooth
  • Carplay & Android Auto compatible
  • Muchas otras funciones de la app
Aplicaciones
Redes sociales
v8.8.10| © 2007-2026 radio.de GmbH
Generated: 4/16/2026 - 2:45:42 AM