
Concepts

A quick walkthrough of some fundamental Matatika concepts and their usage.


Modern Data Stack

There are many opinions on the “Modern Data Stack”, and frankly no single design can fit both Google scale and a five-person startup. At Matatika, we have selected components that we believe give you the best combination of scalability, performance, flexibility, and lowest total cost of ownership. Ultimately, our goal is to provide a modern data stack with No Limits.

Data Ops

Extraction

Meltano is the primary tool used to manage data extraction in your Matatika workspace. Matatika are active contributors to Meltano and will continue to invest in other technologies that advance our customers’ ability to implement Data Ops extraction methodologies.

Warehousing

PostgreSQL is the default data storage technology in your Matatika workspace. Data technologies will continue to advance, and we therefore believe it is vital that your stack be database agnostic. Currently, SQL over ODBC/JDBC is the most widely adopted business intelligence data interface. The Matatika Platform supports any JDBC-compliant database or serverless data warehouse, such as Google BigQuery or AWS Athena.
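To illustrate what "database agnostic" can mean in practice, here is a minimal sketch in Python. It is not Matatika code: it uses Python's DB-API 2.0 as a rough analogue of JDBC, and the standard-library sqlite3 module as a stand-in database. The `customers` table, the `top_customers` function, and the demo connection factory are all hypothetical.

```python
import sqlite3


def top_customers(connect, limit):
    """Run a plain SQL query through any DB-API 2.0 connection factory.

    `connect` is any zero-argument callable returning a DB-API connection,
    so the query code does not depend on a specific database driver.
    """
    conn = connect()
    try:
        cur = conn.cursor()
        cur.execute("SELECT name, revenue FROM customers ORDER BY revenue DESC")
        return cur.fetchmany(limit)
    finally:
        conn.close()


def sqlite_demo_connection():
    """Hypothetical factory: an in-memory SQLite database with sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, revenue REAL)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("Acme", 120.0), ("Globex", 300.0), ("Initech", 45.5)],
    )
    return conn


result = top_customers(sqlite_demo_connection, 2)
```

Swapping the factory for one backed by another DB-API driver would, in principle, leave `top_customers` unchanged, which is the property the paragraph above describes for JDBC-compliant databases.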

Transformation

dbt is the default transformation technology in your Matatika workspace. SQL-based transformations managed as code, with testing built in by design, provide the fundamentals required to support Data Ops methodologies at the transformation layer. The Matatika Platform is able to execute any transformation technology in isolated containers.

Orchestration

Spring Cloud Data Flow is the main orchestration technology in your Matatika workspace. The Matatika Platform takes care of scheduling, log collection, and credential management, running all workspace jobs in isolated containers.

Catalog

Your datasets and models are published, indexed and searchable via the Matatika API, CLI, or Configuration Repository. The Matatika Platform provides a personalised feed of insights by scoring your datasets by usage.
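As a rough illustration of scoring datasets by usage, the sketch below ranks datasets by how often they were viewed. The actual scoring used by the Matatika Platform is not described here, so the view-count approach, the `rank_datasets` function, and the dataset names are all assumptions made for the example.

```python
from collections import Counter


def rank_datasets(view_events, limit=3):
    """Order dataset names by view count, most viewed first.

    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(view_events)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ordered[:limit]]


# Hypothetical usage log: one entry per dataset view.
events = [
    "sales-by-month", "churn", "sales-by-month",
    "pipeline-health", "churn", "sales-by-month",
]
feed = rank_datasets(events, limit=2)
```

A real feed would likely weight recency and the individual user's own activity as well; plain counts are used here only to keep the idea visible.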

Analytics

Jupyter Notebooks and visualisations in Chart.js and Google Charts formats can be published to your workspace. The Matatika dataset format gives you full control of the chart visualisation as code, supporting Data Ops through to the analytics layer of your stack.
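To show what a chart "as code" can look like, here is a minimal sketch that builds a Chart.js-style bar chart configuration as plain data and serialises it to JSON, so it could be stored and reviewed in version control. It follows the general Chart.js configuration shape (`type`, `data`, `options`) and is not the Matatika dataset format itself; the `bar_chart_config` helper and the sample figures are hypothetical.

```python
import json


def bar_chart_config(title, labels, values):
    """Build a Chart.js-style bar chart configuration as a plain dict."""
    if len(labels) != len(values):
        raise ValueError("labels and values must have the same length")
    return {
        "type": "bar",
        "data": {
            "labels": list(labels),
            "datasets": [{"label": title, "data": list(values)}],
        },
        "options": {"plugins": {"title": {"display": True, "text": title}}},
    }


config = bar_chart_config("Orders per month", ["Jan", "Feb", "Mar"], [12, 18, 9])

# Sorted keys and fixed indentation keep diffs small when the chart changes.
serialised = json.dumps(config, indent=2, sort_keys=True)
```

Because the chart is ordinary data, a change to its type, labels, or options shows up as a readable diff rather than as a manual edit in a dashboard tool.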


Data Ops

Your Matatika workspace is a unique Business Intelligence as Code solution. All artifacts are managed in a Git Configuration Repository with credentials securely stored inside the platform. We enable you to deliver a robust analytics solution without restricting you to any specific Data Ops methodology.
