June 2025

The best insights come from real conversations, with real leaders making tough calls, solving messy problems, and figuring out how to scale their data teams in the real world. That’s why I started The Data Leaders Digest. Each month, we share what’s actually happening inside data teams, pulled from our podcasts, live events, meetups, and honest behind-the-scenes conversations. No theory. Just practical takeaways you can use right now to make better decisions, avoid common pitfalls, and keep moving forward.
This Month’s Focus

In this edition:
Market Insights

Fivetran’s Connector Pricing: Why the Model Misses the Mark

We talk a lot about Fivetran, particularly their approach to pricing and the ripple effects it’s having across data teams. The changes they introduced back in March are still having a big impact on how data leaders think, plan, and budget. It’s not because we think Fivetran has a bad product. And it’s certainly not because we believe they’re trying to rip people off. We just think they’ve gone wrong, and a good example of that is their approach to connector pricing.

Let’s say your team runs 50 connectors. There are three main reasons we think this model creates unnecessary friction, and the clearest is that you’re not rewarded for scale. Each connector is priced separately, so volume discounts don’t kick in. The more connectors you have, the worse it gets: you’re paying premium rates again and again for the same level of processing.

Why this matters

Pricing isn’t just a finance issue. It shapes how data teams operate. When costs scale unpredictably, it limits what teams can do, delays delivery, and introduces tension with other parts of the business. As the industry matures, we believe pricing models need to change too. They should support growth, not punish it, and reflect the value teams actually get from the platform.
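To make the “not rewarded for scale” point concrete, here is a minimal sketch with made-up tier boundaries and prices (none of these numbers are Fivetran’s actual rates): when usage tiers reset for every connector, 50 small connectors never reach the cheaper tiers that one pooled, account-level meter would unlock.

```python
# Hypothetical illustration only: the tier boundaries and prices below are
# made up, not any vendor's actual rates. The point is structural: when tiers
# reset per connector, total spend never reaches the cheaper tiers that a
# pooled, account-level model would unlock.

TIERS = [  # (rows_up_to, price_per_million_rows) -- illustrative numbers
    (5_000_000, 500.0),
    (20_000_000, 300.0),
    (float("inf"), 150.0),
]

def tiered_cost(rows: int) -> float:
    """Cost of `rows` monthly active rows under the illustrative tiers."""
    cost, prev_cap = 0.0, 0
    for cap, price in TIERS:
        band = min(rows, cap) - prev_cap
        if band <= 0:
            break
        cost += band / 1_000_000 * price
        prev_cap = cap
    return cost

connectors = 50
rows_per_connector = 2_000_000  # 2M rows each, 100M in total

per_connector = sum(tiered_cost(rows_per_connector) for _ in range(connectors))
pooled = tiered_cost(connectors * rows_per_connector)

print(f"Per-connector tiering: ${per_connector:,.0f}")
print(f"Account-level pooling: ${pooled:,.0f}")
```

Under these illustrative numbers, the per-connector model comes out at $50,000 a month against $19,000 for the pooled one. The exact figures don’t matter; the shape of the curve does.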
Snowflake and Databricks Summits: Two Visions, One Challenge

Both Snowflake and Databricks held their flagship events this month: Snowflake Summit from 3rd to 6th June, and Databricks’ Data + AI Summit the week after, from 10th to 13th June. As you’d expect, there was a lot of excitement around AI, platform updates, and new product releases. But the more interesting thread, at least for us, was how differently these two companies are approaching pricing.

Snowflake continues to focus on simplicity through abstraction. You pay for credits, they optimise the platform in the background, and the idea is that you don’t need to worry about the details. Databricks takes almost the opposite stance. You run the platform in your own cloud, pay for what you consume, and take responsibility for tuning it. That gives you more control, but also more complexity.

It’s a useful contrast, but whichever way you lean, the core issue remains the same. Most pricing models are still designed around how the vendor operates, not how your team works. Whether you’re managing credits or DBUs, you’re still dealing with a level of abstraction that makes cost forecasting harder than it should be.

If you’d like a deeper look at how these two philosophies stack up, and what they reveal about the direction of data platform pricing, we unpack it all in the article below.

🔗 Read the blog: Snowflake vs Databricks—Who’s Really Playing Fair?
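As a rough illustration of the forecasting point, here is a hedged back-of-envelope sketch; every rate and sizing factor in it is an assumption, not a published Snowflake or Databricks price. What it shows is the structural difference: in one model you forecast through credits, in the other through DBUs plus the underlying cloud compute you run yourself.

```python
# Back-of-envelope forecast sketch. All rates and sizing factors here are
# illustrative assumptions, not published Snowflake or Databricks prices.
# Either way, you must translate "hours of work on a cluster of size X"
# into a vendor-specific unit before you see dollars.

def snowflake_estimate(warehouse_credits_per_hour: float,
                       hours_per_day: float,
                       price_per_credit: float,
                       days: int = 30) -> float:
    """Credit model: credits consumed per hour depend on warehouse size."""
    return warehouse_credits_per_hour * hours_per_day * price_per_credit * days

def databricks_estimate(dbus_per_hour: float,
                        hours_per_day: float,
                        price_per_dbu: float,
                        cloud_vm_cost_per_hour: float,
                        days: int = 30) -> float:
    """DBU model: you pay the DBU rate plus the underlying cloud compute."""
    return (dbus_per_hour * price_per_dbu + cloud_vm_cost_per_hour) * hours_per_day * days

# Same notional workload, two different abstractions to forecast through.
print(f"Credit-based: ${snowflake_estimate(8, 10, 3.0):,.0f}/month")
print(f"DBU-based:    ${databricks_estimate(12, 10, 0.55, 6.0):,.0f}/month")
```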
When Google Goes Down, What Happens to Your Data Team?

On 12th June, a major Google Cloud outage took down more than 70 Google services, including BigQuery, and knocked out customers such as Spotify, OpenAI, and Shopify. For data teams, it meant hours of firefighting: fixing broken pipelines, rerunning failed jobs, and explaining blank dashboards to stakeholders. Nobody shipped anything new. Nobody optimised. For most teams, the day was lost to maintenance.

What this incident really exposed is how fragile many modern data stacks still are. With 15–20 tools stitched together, a single outage doesn’t just affect one system; it causes a cascade of failures across your warehouse, transformation layer, BI tools, and everything in between. The reality is, most teams are still spending more time fixing problems than building solutions. And outages like this make that painfully obvious.

We’ve written a full breakdown of what went wrong, how it impacted real teams, and what can be done to build more resilient, reliable infrastructure going forward.

🔗 Read the blog: What Google Cloud’s June Outage Really Cost Data Teams
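One concrete resilience pattern, offered as a minimal sketch rather than a prescription: wrap pipeline steps in retries with exponential backoff and jitter, so a transient provider outage surfaces as a delayed run instead of a broken dashboard. `run_bigquery_load` below is a hypothetical stand-in for whatever your pipeline step actually does.

```python
# A sketch of one small resilience pattern: retries with exponential backoff
# and jitter around a pipeline step. `run_bigquery_load` is a hypothetical
# placeholder, not a real client call.
import random
import time


def with_retries(fn, *, attempts: int = 5, base_delay: float = 2.0):
    """Call `fn`, retrying on failure with exponential backoff plus jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:  # in practice, catch the provider's transient error types
            if attempt == attempts:
                raise  # out of retries: let orchestration alert a human
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, 1)
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.1f}s")
            time.sleep(delay)


def run_bigquery_load():
    # Placeholder for a real load job; raise inside to simulate a transient failure.
    ...


# Usage: with_retries(run_bigquery_load)
```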
LinkedIn Live Recap: Maximising Data ROI? Stack It With the Best

This session followed on from our May event on cutting ETL costs without sacrificing performance. That conversation highlighted a key takeaway: people, not platforms, are often your biggest cost. So this month, we looked at the flip side: how to get the most out of your team, your tools, and your structure.
Key takeaways from the session
Missed the event?
Episode 6: Stop Scaling What You Don’t Understand

🎙️ John Napoleon-Kuofie, Analytics Engineer at Monzo

We wrapped up Season 2 of the podcast with a brilliant conversation featuring John Napoleon-Kuofie from Monzo. If you’ve ever inherited messy data models or felt the pressure to scale without clear foundations, this one will hit home.

John walked us through how his team at Monzo tackled more than 1,000 dbt models, most of them inherited, inconsistent, and poorly documented. Instead of trying to optimise what was there, they made a bold call: stop, simplify, and rebuild from first principles.

Some of the highlights:
New Resources

Make the Right Call on Architecture and Cost

This month we’ve published four new guides to help you cut through architectural trade-offs, clarify storage decisions, and navigate ETL pricing with confidence:
Found this edition useful? Pass it on!

Help us reach more data leaders who need these insights:

→ Forward this email to a colleague who’d benefit

Follow Matatika on LinkedIn | Subscribe for More | Visit Our Website
Stay up to date with the latest news and insights for data leaders.