The Data Leaders Digest – Fifth Edition

June 2025 Edition | Published 14th July 2025

The best insights come from real conversations with real leaders: the ones making tough calls, solving messy problems, and figuring out how to scale their data teams in the real world. That’s why I started The Data Leaders Digest.

Each month, we share what’s actually happening inside data teams, pulled from our podcasts, live events, meetups, and honest behind-the-scenes conversations.

No theory. Just practical takeaways you can use right now to make better decisions, avoid common pitfalls, and keep moving forward.

This Month’s Focus

In this edition:

  • Why Fivetran’s connector pricing is still raising eyebrows
  • What Snowflake and Databricks summits revealed about vendor strategy
  • A closer look at the Google Cloud outage and what it exposed
  • Highlights from our June LinkedIn Live on data ROI
  • Season 2 finale of the Data Matas podcast with John Napoleon-Kuofie
  • Four new resource guides on architecture, storage, and pricing decisions
 

Market Insights

Fivetran’s Connector Pricing: Why the Model Misses the Mark

We talk a lot about Fivetran, particularly their approach to pricing and the ripple effects it’s having across data teams. The changes they introduced back in March are still having a big impact on how data leaders think, plan, and budget.

It’s not because we think Fivetran has a bad product. And it’s certainly not because we believe they’re trying to rip people off. We just think they’ve got the model wrong, and their approach to connector pricing is a good example.

Let’s say your team runs 50 connectors.
Each one processes 10 million rows per month.
With Fivetran, pricing is now calculated per connector, per row.
Even though your total row volume is high (500 million rows), there’s no volume discount across your account: just 50 separate pricing bands that each max out individually.

That means you’re not rewarded for scale.
You’re paying the premium rate 50 times over, because each connector hits its pricing ceiling on its own.
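
To make the arithmetic concrete, here’s a minimal sketch in Python. The tier boundaries and per-million-row rates are invented for illustration (they are not Fivetran’s published prices), but they show how 50 isolated pricing bands compare with one account-wide band:

```python
# Hypothetical graduated tiers: (row ceiling, price per million rows).
# These numbers are made up for illustration only.
TIERS = [
    (10_000_000, 100.0),   # first 10M rows at the premium rate
    (100_000_000, 60.0),   # next 90M rows at a discounted rate
    (float("inf"), 30.0),  # everything beyond 100M at the cheapest rate
]

def tiered_cost(rows: int) -> float:
    """Cost of `rows` under a simple graduated tier schedule."""
    cost, floor = 0.0, 0
    for ceiling, rate_per_million in TIERS:
        billable = min(rows, ceiling) - floor
        if billable <= 0:
            break
        cost += billable / 1_000_000 * rate_per_million
        floor = ceiling
    return cost

connectors, rows_each = 50, 10_000_000

# Per-connector pricing: 50 bands that each max out alone, so no single
# connector ever reaches the discounted tiers.
per_connector = connectors * tiered_cost(rows_each)  # 50 x $1,000 = $50,000

# Account-wide pricing: one band over the pooled 500M rows.
account_wide = tiered_cost(connectors * rows_each)   # $18,400

print(f"Per-connector: ${per_connector:,.0f}  vs  account-wide: ${account_wide:,.0f}")
```

Under these made-up tiers, pooling volume would cost roughly a third of the per-connector total. The exact ratio depends on the real price list, but the shape of the problem is the same.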

There are three main reasons we think this model creates unnecessary friction:

  1. Connector-based pricing limits scale. There’s no account-wide discount, just individual pricing bands that max out separately. That means you don’t benefit from growing your data operations; you just pay more.
  2. It makes budgeting harder than it needs to be. Each new data source adds incremental and often unpredictable cost. Forecasting becomes a guessing game, and finance teams are left trying to explain line items they don’t fully control.
  3. It penalises teams for doing more. The more data you use, the more expensive it gets, even if your infrastructure and team scale efficiently. That doesn’t feel like the right trade-off.


Why this matters

Pricing isn’t just a finance issue. It shapes how data teams operate. When costs scale unpredictably, it limits what teams can do, delays delivery, and introduces tension with other parts of the business.

As the industry matures, we believe pricing models need to change too. They should support growth, not punish it, and reflect the value teams actually get from the platform.

 

Snowflake and Databricks Summits: Two Visions, One Challenge

Both Snowflake and Databricks held their flagship events this month: Snowflake Summit from 3rd to 6th June, and Databricks’ Data + AI Summit the week after, from 10th to 13th June. As you’d expect, there was a lot of excitement around AI, platform updates, and new product releases.

But the more interesting thread, at least for us, was how differently these two companies are approaching pricing.

Snowflake continues to focus on simplicity through abstraction. You pay for credits, they optimise the platform in the background, and the idea is that you don’t need to worry about the details.

Databricks takes almost the opposite stance. You run the platform in your own cloud, pay for what you consume, and take responsibility for tuning it. That gives you more control, but also more complexity.

It’s a useful contrast, but whichever way you lean, the core issue remains the same. Most pricing models are still designed around how the vendor operates, not how your team works. Whether you’re managing credits or DBUs, you’re still dealing with a level of abstraction that makes cost forecasting harder than it should be.
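
To see why forecasting stays hard under both abstractions, here’s a rough back-of-envelope sketch. Every rate below is assumed for illustration; real credit and DBU burn rates vary by warehouse or cluster size, edition, and cloud region:

```python
# Back-of-envelope comparison of the two abstractions, with made-up rates.
# None of these figures are quoted prices.

hours_per_month = 200

# Snowflake-style: you buy credits; the warehouse size sets the burn rate.
credits_per_hour = 4        # assumed burn rate for a mid-size warehouse
price_per_credit = 3.00     # assumed price in USD
snowflake_estimate = hours_per_month * credits_per_hour * price_per_credit

# Databricks-style: you pay per DBU *plus* the cloud VMs in your own account.
dbus_per_hour = 8           # assumed cluster consumption
price_per_dbu = 0.55        # assumed rate
vm_cost_per_hour = 10.0     # assumed underlying VM spend
databricks_estimate = hours_per_month * (dbus_per_hour * price_per_dbu + vm_cost_per_hour)

print(f"Snowflake-style estimate:  ${snowflake_estimate:,.0f}/month")
print(f"Databricks-style estimate: ${databricks_estimate:,.0f}/month")
```

Neither estimate maps cleanly onto what the team actually delivered. That gap between the billing unit and the work done is the abstraction problem in both models.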

If you’d like a deeper look at how these two philosophies stack up, and what they reveal about the direction of data platform pricing, we unpack it all in the article below.

🔗 Read the blog: Snowflake vs Databricks—Who’s Really Playing Fair?

 

When Google Goes Down, What Happens to Your Data Team?

On 12th June, a major Google Cloud outage disrupted more than 70 Google Cloud services, including BigQuery, and knocked out dependent platforms such as Spotify, OpenAI, and Shopify. For data teams, it meant hours of firefighting: fixing broken pipelines, rerunning failed jobs, and explaining blank dashboards to stakeholders.

Nobody shipped anything new. Nobody optimised. For most teams, the day was lost to maintenance.

What this incident really exposed is how fragile many modern data stacks still are. With 15 to 20 tools stitched together, a single outage doesn’t just affect one system; it causes a cascade of failures across your warehouse, transformation layer, BI tools, and everything in between.

The reality is that most teams are still spending more time fixing problems than building solutions. And outages like this make that painfully obvious.
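
One practical step towards resilience is automating the reruns. Here’s a minimal sketch, in Python, of retrying a transient failure with exponential backoff and jitter; `run_job` is a hypothetical stand-in for any pipeline step you’d otherwise restart by hand:

```python
import random
import time

def run_with_backoff(run_job, max_attempts=5, base_delay=2.0, max_delay=300.0):
    """Retry a job that may fail transiently, doubling the wait each attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_job()
        except Exception as exc:  # in practice, catch only your client's transient errors
            if attempt == max_attempts:
                raise  # out of retries: surface the failure to alerting
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            delay += random.uniform(0, delay / 2)  # jitter avoids synchronized retry storms
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay:.0f}s")
            time.sleep(delay)

# Hypothetical usage: wrap a warehouse query so a brief blip triggers a
# retry rather than a manual rerun and a blank dashboard.
# run_with_backoff(lambda: client.query("SELECT 1").result())
```

Backoff won’t save a multi-hour outage, but it does turn the short blips, which are far more common, into non-events.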

We’ve written a full breakdown of what went wrong, how it impacted real teams, and what can be done to build more resilient, reliable infrastructure going forward.

🔗 Read the blog: What Google Cloud’s June Outage Really Cost Data Teams

 

LinkedIn Live Recap: Maximising Data ROI? Stack It With the Best

This session followed on from our May event on cutting ETL costs without sacrificing performance. That conversation highlighted a key takeaway: people, not platforms, are often your biggest cost. So this month we looked at the flip side: how to get the most out of your team, your tools, and your structure.

  • Jack Colsy (Incident.io) spoke about the power of hiring senior engineers early. It’s not just about building fast; it’s about knowing what not to build. Experienced people move through ambiguity and keep things lean.
  • Harry Gollop (Cognify Search) shared how tooling decisions shape culture. Every new tool adds overhead: onboarding, alignment, duplication. Smart teams know when to say no.
  • And Aaron Phethean (Matatika) challenged us to think about cost differently. Sometimes the real expense isn’t switching tools; it’s sticking with ones that quietly slow your team down.

Key takeaways from the session

  • Senior hires cut through ambiguity and reduce long-term tech debt
  • Tools should save time, not create more coordination work
  • If your team is constantly reacting, your setup isn’t working
  • Most teams don’t need more dashboards; they need faster answers

Missed the event? 

📺 Watch the full replay here

 
 

Episode 6: Stop Scaling What You Don’t Understand

🎙️ John Napoleon-Kuofie, Analytics Engineer at Monzo

We wrapped up Season 2 of the podcast with a brilliant conversation featuring John Napoleon-Kuofie from Monzo. If you’ve ever inherited messy data models or felt the pressure to scale without clear foundations, this one will hit home.

John walked us through how his team at Monzo tackled more than 1,000 dbt models, most of them inherited, inconsistent, and poorly documented. Instead of trying to optimise what was there, they made a bold call: stop, simplify, and rebuild from first principles.

Some of the highlights:

  • Rebuild around real concepts, not legacy logic
  • Don’t test what you’re not willing to fix
  • Make your models AI-ready by making them understandable
  • Let small internal fixes scale across the company
  • Leave a trail of thinking, not just clean code

If your team is navigating technical debt or trying to prepare for AI on top of a shaky stack, John’s approach offers a practical path forward.

🎧 Listen to the full episode

📺 Watch the episode on YouTube

 

New Resources

Make the Right Call on Architecture and Cost

This month we’ve published four new guides to help you cut through architectural trade-offs, clarify storage decisions, and navigate ETL pricing with confidence:

  1. Choosing Between Star and Snowflake Schemas for Your Data Warehouse
    A practical guide to schema design: when to favour performance, when to prioritise storage, and how to safely test changes before you commit.
    🔗 Read the full article

  2. Data Lake vs Data Warehouse: What’s the Difference and Which Should You Choose?
    Stop thinking “either/or”: this article explores hybrid architecture options that give you flexibility without compromising governance.
    🔗 Read the full article

  3. Snowflake Columnar Storage: Why This Architecture Could Cut Your Analytics Costs by 70%
    Find out how Snowflake’s columnar design reduces query time, storage spend, and ETL compute, all without changing your stack.
    🔗 Read the full article

  4. Understanding Today’s ETL Pricing Landscape: Column vs Row Approaches
    A clear breakdown of modern ETL pricing models, what each one means for your data strategy, and how to avoid being penalised for growth.
    🔗 Read the full article

Found this edition useful? Pass it on!

Help us reach more data leaders who need these insights:

→ Forward this email to a colleague who’d benefit
→ Share on LinkedIn with your network
→ Join our growing community of 180+ data leaders getting monthly insights

Follow Matatika on LinkedIn | Subscribe for More | Visit Our Website

Connect to Apps & Data now
Use Matatika to rapidly produce insights from 500+ apps and community sources
Speak to an expert
Build a connector
Integrate your App or securely connect to your private data.
Learn more
Partner with us
Are you a data provider? We can work with you to publish your data.
Contact Us

Data Leaders Digest

Stay up to date with the latest news and insights for data leaders.