PodcastsCienciasThe TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Sam Charrington
The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Último episodio

788 episodios

  • The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

    Is RAG Dead? Lessons from Building AI for Tax Law with Alex Bowcut - #769

    09/06/2026 | 51 min
    As context windows grow into the millions of tokens, many AI practitioners are questioning whether retrieval-augmented generation (RAG) is still necessary. If modern models can ingest entire libraries of documents, why bother with retrieval at all?

    In this episode, Alex Bowcut, Head of Engineering at Sphere, explains why the answer depends on the application. Sphere uses AI to automate global tax compliance—an environment where getting the answer right isn’t enough. Every conclusion must be backed by the correct legal citation, and every decision must withstand expert review.

    We explore how Sphere built TRAM (Tax Review and Assessment Model), a production AI system that combines retrieval, reasoning models, legal review workflows, reinforcement learning, and deterministic systems to help tax experts move nearly two orders of magnitude faster while maintaining accuracy.

    Along the way, we discuss why RAG remains critical in high-stakes domains, how Sphere processes legal and regulatory documents from jurisdictions around the world, retrieval architectures, semantic chunking, dense versus sparse retrieval, expert feedback loops, and the challenges of building AI systems that people can actually trust.

    🗒️  Full show notes: https://twimlai.com/go/769.
  • The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

    Relational Foundation Models for Enterprise Data with Jure Leskovec - #768

    21/05/2026 | 1 h 6 min
    In this episode, Jure Leskovec, co-founder and chief scientist at Kumo and professor of computer science at Stanford, joins us to explore two fronts of his work: AI for science and relational deep learning. We begin with AI Virtual Cell, a multiscale effort to learn data-driven representations from proteins to cells to patients using single-cell RNA-seq data, protein language models like ESM, and structure models like AlphaFold—without hand-encoding biology. Jure then dives into relational deep learning, reframing enterprise databases as graphs and training neural networks directly on raw multi-table data. He explains Kumo’s Relational Foundation Model (RFM2), which performs in-context learning over subgraphs to make accurate predictions on new databases and tasks with no training, and how this approach benchmarks against RelBench and other multi-table datasets. We also discuss real-world deployments at companies like Reddit, DoorDash, and Coinbase, explainability via attention over tables and columns, integration with agentic systems, deployment options, and practical limitations.

    The complete show notes for this episode can be found at https://twimlai.com/go/768.
  • The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

    How to Find the Agent Failures Your Evals Miss with Scott Clark - #767

    07/05/2026 | 53 min
    In this episode, Scott Clark, co-founder and CEO of Distributional, joins us to explore how teams can reliably operate and improve complex LLM systems and agents in production. Scott introduces a Maslow’s hierarchy of observability: telemetry for logging, monitoring for known signals, and post-production or online analytics to surface unknown unknowns. We dig into examples of real-world failures Scott’s team has seen in production systems, such as “lazy” tool-use hallucinations that standard evals miss, and how mapping traces into vector fingerprints enables clustering and topic discovery to uncover emergent behaviors. Scott explains how analytics can feed the data flywheel by generating evals, guardrails, and training data, and why online, adaptive approaches are essential for non-stationary models. We also touch on practical how-to’s such as instrumentation with OpenTelemetry, the GenAI semantic conventions, and the role of dedicated analytics tools.

    The complete show notes for this episode can be found at https://twimlai.com/go/767.
  • The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

    How to Engineer AI Inference Systems with Philip Kiely - #766

    30/04/2026 | 54 min
    In this episode, Philip Kiely, head of AI education at Baseten, joins us to unpack the fast-evolving discipline of inference engineering. We explore why inference has become the stickiest and most critical workload in AI, how it blends GPU programming, applied research, and large-scale distributed systems, and where the line sits between inference and model serving. Philip shares how research-to-production can move in hours, not months, and why understanding “the knobs” of inference—batching, quantization, speculation, and KV cache reuse—lets teams design better products and SLAs. We trace the inference maturity journey from closed APIs to dedicated deployments and in-house platforms, discuss GPU lifecycles, and survey today’s runtime landscape, including vLLM, SGLang, and TensorRT LLM. Finally, we look ahead to agents and multimodality, making the case for specialized, workload-specific runtimes when performance and efficiency matter most.

    The complete show notes for this episode can be found at https://twimlai.com/go/766.
  • The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

    How Capital One Delivers Multi-Agent Systems with Rashmi Shetty - #765

    16/04/2026 | 54 min
    In this episode, Rashmi Shetty, senior director of enterprise generative AI platform at Capital One, joins us to explore how the company is designing, deploying, and scaling multi-agent systems in a highly regulated environment. Rashmi walks us through Chat Concierge, a multi-agent chat experience for auto dealerships that handles intent disambiguation, tool invocation, and human handoffs to deliver safer, more personalized customer journeys. We discuss Capital One’s platform-centric approach to AI agents and how it separates design from runtime governance, embedding policies, guardrails, and cyber controls across agent threat boundaries. Rashmi shares how the team approaches the developer experience for agent builders, observability, and evals for stochastic, multi-agent workflows; and strategies for model specialization, including fine-tuning and distillation. We also cover standards and abstraction, closed-loop learning from production telemetry, and key lessons for enterprises building agentic systems.

    The complete show notes for this episode can be found at https://twimlai.com/go/765.
Más podcasts de Ciencias
Acerca de The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)
Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers and tech-savvy business and IT leaders. Hosted by Sam Charrington, a sought after industry analyst, speaker, commentator and thought leader. Technologies covered include machine learning, artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, data science and more.
Sitio web del podcast

Escucha The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence), Jefillysh: Ciencia Simplificada y muchos más podcasts de todo el mundo con la aplicación de radio.net

Descarga la app gratuita: radio.net

  • Añadir radios y podcasts a favoritos
  • Transmisión por Wi-Fi y Bluetooth
  • Carplay & Android Auto compatible
  • Muchas otras funciones de la app