Modern ETL pricing models often charge by row count, a metric fundamentally misaligned with how analytical systems actually process data: columnar scans optimised for compute efficiency and performance. This disconnect creates technical debt and unpredictable costs, and it diverts engineering resources away from optimisation and innovation toward managing arbitrary billing constraints.
Row-based ETL pricing models create unpredictable, disproportionately high costs that penalise business growth, disrupt budgeting, and divert engineering resources from innovation to cost control. Performance-based pricing, aligned with actual infrastructure usage, offers a more predictable and strategic alternative that supports scalable data operations without financial volatility.
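To make the mismatch concrete, here is a minimal sketch comparing the two billing models. The rates, row volumes, and compute hours are hypothetical, chosen only to show the shape of each cost curve, not any vendor's actual prices.

```python
# Illustrative comparison of two ETL billing models.
# All rates and volumes below are hypothetical.

ROW_RATE = 0.50 / 1_000_000      # $ per row synced (row-based billing)
COMPUTE_RATE = 2.00              # $ per compute-hour (performance-based billing)

def row_based_cost(rows_synced: int) -> float:
    """Cost scales linearly with row count, regardless of work done."""
    return rows_synced * ROW_RATE

def performance_based_cost(compute_hours: float) -> float:
    """Cost scales with actual infrastructure usage."""
    return compute_hours * COMPUTE_RATE

# A columnar engine touching only a few columns may need little extra
# compute even as row counts grow tenfold.
for rows, hours in [(10_000_000, 1.0), (100_000_000, 2.5)]:
    print(f"{rows:>12,} rows: row-based ${row_based_cost(rows):,.2f} "
          f"vs performance-based ${performance_based_cost(hours):,.2f}")
```

Under row-based billing the bill grows tenfold with the data; under performance-based billing it tracks the much smaller growth in actual work, which is what makes costs predictable as volumes scale.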
The June 12, 2025 Google Cloud outage revealed a harsh truth: modern data stacks often create more firefighting than innovation, as fragmented toolchains and so-called “managed” services increase maintenance burdens, costs, and risk. Matatika’s Mirror Mode offers a risk-free path out of this cycle by allowing teams to validate a more stable, antifragile infrastructure—enabling a shift from constant maintenance to strategic, high-impact data work.
Many data teams avoid proper data modelling due to its perceived complexity, often relying on ad-hoc structures that lead to performance issues and eroded trust in analytics. The most effective teams use flexible schema strategies, balancing star and snowflake designs, to align with their specific performance, storage, and maintenance needs.
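As a minimal illustration of that trade-off, the sketch below builds hypothetical product and sales tables in pandas. The star design stores the category inline in one wide dimension, while the snowflake design normalises it into a sub-dimension, trading an extra join for less redundancy.

```python
import pandas as pd

# Star schema: one wide, denormalised dimension joined directly to the fact.
dim_product_star = pd.DataFrame({
    "product_id": [1, 2],
    "product_name": ["Widget", "Gadget"],
    "category_name": ["Hardware", "Hardware"],  # category stored inline
})

# Snowflake schema: the dimension is normalised into sub-dimensions,
# saving storage at the cost of an extra join.
dim_product_snow = pd.DataFrame({
    "product_id": [1, 2],
    "product_name": ["Widget", "Gadget"],
    "category_id": [10, 10],
})
dim_category = pd.DataFrame({
    "category_id": [10],
    "category_name": ["Hardware"],
})

fact_sales = pd.DataFrame({
    "product_id": [1, 2, 1],
    "amount": [9.99, 24.50, 9.99],
})

# Star: one join answers "sales by category".
star = fact_sales.merge(dim_product_star, on="product_id")
print(star.groupby("category_name")["amount"].sum())

# Snowflake: two joins for the same question.
snow = (fact_sales
        .merge(dim_product_snow, on="product_id")
        .merge(dim_category, on="category_id"))
print(snow.groupby("category_name")["amount"].sum())
```

Both answer the same question; the star version needs one fewer join and tends to query faster, while the snowflake version avoids repeating category attributes across every product row.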
Many organisations feel forced to choose between a data lake or a data warehouse due to cost, complexity, or skill constraints, often settling for suboptimal setups that limit agility and inflate costs. Leading data teams are now adopting hybrid lakehouse architectures and transition tools like Mirror Mode to unify storage, improve analytics speed, and cut spend, without the disruption of traditional migrations.
Monzo’s data team, led by John Napoleon-Kuofie, chose to rebuild their data platform from first principles after inheriting over 1,000 inconsistent dbt models, prioritising clarity and maintainability over scale. Their experience, shared on the Data Matas podcast, underscores a broader industry shift: sustainable innovation in data and AI begins with simplified models, clear ownership, and a culture that empowers individuals to drive meaningful change.
Snowflake and Databricks take fundamentally different pricing approaches: Snowflake offers managed optimisation with less control, while Databricks provides flexibility with greater complexity. The real shift in value lies in adopting warehouse-agnostic, performance-based ETL pricing that aligns cost with actual infrastructure use, offering transparency and freedom from vendor lock-in.
Most ETL pricing models haven’t kept pace with the evolving data landscape, leaving many teams overpaying for row-based processing that penalises growth and efficiency. This blog advocates for a shift toward performance-based pricing aligned with column-oriented processing, offering scalable, transparent cost control that reflects actual infrastructure usage rather than arbitrary metrics.
Snowflake’s columnar storage architecture delivers faster analytics and lower costs by scanning only relevant data, compressing storage intelligently, and optimising queries automatically. This design enables significant performance gains and cost reductions across ETL, storage, and compute, transforming how businesses scale data operations and consume insights.
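Snowflake's internals are proprietary, but the column-pruning principle behind those gains is easy to demonstrate with an open columnar format. The sketch below uses Parquet via pyarrow on a hypothetical events table: reading only the columns a query touches skips the wide payload column entirely.

```python
import pyarrow as pa
import pyarrow.parquet as pq

# Build a small hypothetical events table and store it in Parquet,
# a columnar format that illustrates the same pruning principle.
events = pa.table({
    "event_id": list(range(1_000)),
    "user_id":  [i % 50 for i in range(1_000)],
    "payload":  ["x" * 200] * 1_000,          # wide column we rarely query
    "amount":   [float(i) for i in range(1_000)],
})
pq.write_table(events, "events.parquet")

# A row-oriented reader would touch every byte; a columnar reader can
# scan only the columns a query needs, skipping the wide payload.
needed = pq.read_table("events.parquet", columns=["user_id", "amount"])
print(needed.num_rows, needed.num_columns)    # 1000 rows, 2 columns

# The pruned read is a small fraction of the full table's footprint.
full = pq.read_table("events.parquet")
print(f"pruned: {needed.nbytes:,} bytes vs full: {full.nbytes:,} bytes")
```

The same idea, applied automatically inside a warehouse, is why a query over two columns of a fifty-column table can cost a fraction of a full-table scan.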
This blog explores how data teams can strategically reduce costs without compromising performance, drawing insights from a recent LinkedIn Live featuring experts from Select.dev, Cube, and Matatika. It outlines five key strategies, from optimising human productivity to safely switching platforms, backed by real-world examples and practical implementation steps.