Talks, Presentations, and Podcasts

Andrew A Lamb

Talks and Presentations

2025-07-10 Introduction to Iceberg / Parquet Variant NYC Apache Iceberg™ Community Meetup (slides, YouTube recording)

2025-06-09 Intro to Apache DataFusion: Technology, Community, and Not Quite Enough Time San Francisco Apache DataFusion Meetup (slides, YouTube recording)

2025-06-09 Accelerating Apache Parquet with metadata stores and specialized indexes using Apache DataFusion Open Lakehouse Mini Summit (slides, YouTube recording)

2025-04-18 Cloud Data Lakes, Guest Lecture Boston University, Data Systems Architectures CS-561 (slides, slides (pdf), YouTube recording)

2025-01-23 Intro to DataFusion: Technology, Community, and Not Quite Enough Time
Xomnia Data & Drinks: Building Next-Gen Data Systems with Apache DataFusion (slides, YouTube recording)

2025-01-15 Intro to DataFusion: Technology, Community, and Not Quite Enough Time
Boston Apache DataFusion Meetup (slides)

2024-12-18 Building InfluxDB 3.0 with DataFusion Chicago Apache DataFusion Meetup (slides)

2024-11-25 Building InfluxDB 3.0 (and other systems) without starting from “scratch” with Apache DataFusion UC Berkeley Data Systems and Foundations (DSF) Seminar (slides)

2024-10-28 Apache DataFusion: Design Choices when Building Modern Analytic Systems Boston University: MiDAS Fall 2024 (Data Systems Seminar) (slides, slides (pdf), YouTube recording)

2024-09-27 Apache DataFusion: What, Why and How Belgrade Apache DataFusion Meetup (slides, YouTube recording)

2024-09-23 Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine (talk) Carnegie Mellon University: Database Building Blocks Seminar Series - Fall 2024 (slides, YouTube recording)

2024-06-26 NYC Meetup New York City Apache DataFusion Meetup (slides)

2024-06-26 Building InfluxDB 3.0 (and other systems) without starting from “scratch” with Apache DataFusion Microsoft Gray Systems Lab (slides)

2024-06-25 DataFusion Meetup 2.0 - San Francisco San Francisco Bay Area Apache DataFusion Meetup (slides)

2024-06-14 DataFusion: The Case for Building Open Data Systems (keynote) 2024 Simplicity in Management of Data (SiMOD) workshop (slides)

2024-06-13 Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine (talk) 2024 ACM SIGMOD International Conference on Management of Data (slides, YouTube recording, ACM DOI, PDF)

2023-05-09 Introduction to Apache Arrow and Apache Parquet, using Python and pyarrow (updated) ODSC East 2024 (slides)

2024-03-27 Building InfluxDB 3.0 with Apache Arrow, DataFusion, Flight and Parquet. DataCouncil 2024 (slides, YouTube recording)

2024-03-27 Apache Arrow DataFusion Meetup: Introduction, Agenda, Remarks. (slides, YouTube recording)

2023-09-27 Implementing InfluxDB IOx, “from scratch” using Apache Arrow, DataFusion, and Rust. MIT Database Group (invited talk) (slides)

2023-06-02 Implementing InfluxDB IOx, “from scratch” using Apache Arrow, DataFusion, and Rust. Dutch Seminar on Database System Design (slides, YouTube recording)

2023-05-09 Introduction to Apache Arrow and Apache Parquet, using Python and pyarrow. ODSC East 2023 (slides)

2023-04-05 The Apache Arrow DataFusion Architecture Part 3: Physical Plan and Execution. (slides, YouTube recording)

2023-04-04 The Apache Arrow DataFusion Architecture Part 2: Logical Plans and Expressions. (slides, YouTube recording)

2023-03-31 The Apache Arrow DataFusion Architecture Part 1: Query Engines. (slides, YouTube recording)

2023-02-15 Building a new time series database “from scratch” Using Apache Arrow, Parquet, DataFusion and Rust Invited Talk at Optum Labs (slides)

2022-06-27 DataFusion and Arrow: Supercharge Your Data Analytical Tool with a Rusty Query Engine. Databricks Data+AI Summit (slides, YouTube recording)

2022-05-23 Apache Arrow and DataFusion: Changing the Game for Implementing Database Systems. The Data Thread 2022 (slides, YouTube recording)

2022-04-06 Managing Software Dependencies and the Supply Chain. EM.S20, Wrangling Software Engineering Projects MIT Sloan School of Management, Guest Speaker (slides)

2021-10-13 Query Processing in InfluxDB IOx InfluxData Tech Talk (slides, YouTube recording)

2021-04-20 Apache Arrow and its impact on the database industry CSE-132 Database Systems Implementation University of Southern California, Guest Speaker (slides, YouTube recording)

2021-03 Query Engine Design and the Rust-Based DataFusion in Apache Arrow. InfluxData Tech Talk (slides, alt slides, YouTube recording)

2020-12-09 A Rusty Introduction to Apache Arrow and how it applies to a TimeSeries Database. InfluxData Tech Talk (slides, YouTube recording)

2013-01-10 Tradeoffs in Massively Parallel Analytical Systems. MIT IAP Talk (slides)

Podcast Appearances

Why Rust Will Help You Deliver Better Low-latency Systems and Happier Developers InfoQ Podcast (podcast, YouTube)

2025-04-25 DataFusion: A Database Building Toolkit Developer Voices with Kris Jenkins (podcast, YouTube, Spotify)

2024-08-31 Rebuilding InfluxDB with Rust with Andrew Lamb The Rustacean Station Podcast, https://rustacean-station.org (podcast, mp3)

2024-04-26 Modern OLAP Database System Design with FDAP (Andrew Lamb) The Geek Narrator (YouTube)

2024-04-24 Open Source and the Evolution of Data Systems with Andrew Lamb of InfluxData The DataStack Show, Episode 186 (podcast on YouTube)