The Full Data Stack 10
Claude's new model, DuckDB is everywhere, a new file format and LinkedIn memes
Hey y’all, I’m Hoyt!
Each week, I share thoughts and findings in the world of Data and AI. I span the entire data stack and keep you up to date with the shifting landscape.
Not Subscribed? Come join! ⊂(◉‿◉)つ
AI and Models
Did you hear Claude 4.5 dropped this week? Yeah, so did everyone else. I was halfway through writing a tech review for the AIR Python web framework when I switched over to Claude 4.5 to vibe code the rest of the review’s app demo. Let’s just say that the code output was a pretty major upgrade. I was using Sonnet 4 before and it kept using bad color schemes and emojis. When I coded another section with Claude 4.5, it gave that bad boy a gradient background and nice form design. Feeling good about this one.
ByteByteGo has hopped on the MCP bandwagon, hurray! I’ve written quite a bit about MCP and its possibilities in Data Analysis, but when a major player like ByteByteGo finally builds a beautiful animated workflow diagram, it’s a different level. In many ways, it’s like I’m watching my favorite underground band go mainstream. I’m happy for MCP…but I can’t help but gripe about how I was early to the party.
Claude 4.5 does feel more comprehensive to me. It feels like it can do anything, and it’s a clear upgrade from Claude 4. I had heard it could make PowerPoint slides, so I gave that a try. It wrote me a Python script that generates the slide, and the script looks like it would work.
One of Anthropic’s AI researchers was on a podcast talking about the current state and future of LLMs. He believes AGI is possible with just larger models and more compute. This flies in the face of OG Artificial Intelligence leaders who think LLMs are a dead end.
Sebastian Raschka, PhD offers us the most insane breakdown of how to eval an LLM that I’ve ever seen. It also comes with pictures which my ADHD brain loves 😍.
Engines and Libraries
Polars has secured $12M in a Series A funding round. This is significant because it means Polars is going from an open source data library to an actual company trying to solve a problem in the market. They have started with Polars Cloud, but what else is on the horizon?
DuckDB has a new installation page. Why is this interesting? Because it is just another example of how they pore over every detail to make the DevEx as frictionless as possible.
There’s an interesting DuckDB extension called http_client. I caught this on LinkedIn and haven’t had a chance to try it, but it feels like I should just be able to hit an API endpoint on the internet with this, right?
Carnegie Mellon is showcasing how to use DuckLake with MotherDuck on Oct 6, 2025. If you read this after, it looks like they are going to record it.
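Going by the extension’s docs, here is a minimal sketch of what that might look like from Python. I haven’t run it against a live endpoint. It assumes the duckdb Python package is installed, that http_client installs from the community repository, and that it exposes an http_get(url) function. The helper names (http_get_statements, fetch) are made up for this example.

```python
# Sketch only: assumes the `duckdb` package, network access, and that the
# http_client community extension exposes http_get(url). Helper names are hypothetical.

def http_get_statements(url: str) -> list[str]:
    """Build the SQL needed to call http_get from the http_client extension."""
    safe_url = url.replace("'", "''")  # escape single quotes for a SQL string literal
    return [
        "INSTALL http_client FROM community;",
        "LOAD http_client;",
        f"SELECT http_get('{safe_url}') AS response;",
    ]


def fetch(url: str):
    """Run the statements in an in-memory DuckDB and return the rows of the final query."""
    import duckdb  # imported lazily so the helper above works without DuckDB installed

    con = duckdb.connect()
    *setup, query = http_get_statements(url)
    for sql in setup:
        con.execute(sql)  # install and load the extension
    return con.execute(query).fetchall()
```

Calling fetch() with any public JSON endpoint should hand back a single row holding the response. Check the extension’s README for the exact shape of that response before relying on it.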
FYI, I just made a video on my YouTube channel looking at MotherDuck and I freaking loved it.
High Performance DE Newsletter talks about DuckDB, Iceberg and AWS Glue. I need to read this myself because AWS Glue has eluded me.
Data Engineering
Martin Debus shows you how to build a Python logger framework for Databricks. Martin goes deep into what a Data Lakehouse is and what you really need to worry about to make it airtight.
Simon Späti reminds us that Data Engineering teams should think about…GASP…BUSINESS NEEDS! 🔥
High Performance DE Newsletter gives us an exhaustive test setup for the Iceberg REST Catalog and also shows how insane Spark is to set up sometimes.
Alejandro Aboy walks you through a project to understand a use case for dbt. I also didn’t realize he created a cool AI Data Generator. It’s been bookmarked.
File Formats and Storage
Someone has come out and offered a new storage file format that is meant to challenge Parquet. If you look at the original white paper you’ll see a familiar name, Wes McKinney. Given everything Wes touches turns to gold, best to pay attention. Still, Iceberg is married to Parquet so it’s not going to break the internet anytime soon.
I loved this fantastic breakdown on physical and file storage by Erfan Hesami. I have an almost unhealthy fascination with file formats and hardware storage concepts and architecture. I love Parquet, Apache Arrow, DuckDB, LSM Trees and standard block storage databases like MySQL. I love it all. I’ve read chapters of Designing Data-Intensive Applications multiple times. So I have high expectations. This article passed the test.
LinkedIn Memes
Alex Chiou reminds us that the tech interview process is completely broken.
Nick Valiotti is kinda the GOAT right now for data memes on LinkedIn.