Publishing to Matatika Mobile from a Jupyter Notebook

Published on September 24, 2020

What a busy month so far! In the last 2 weeks, we’ve pushed 250 changes into production, most of them new features based on your feedback. Keep it coming!

Our Data Scientists and Data Heroes told us they need a better way to collaborate on their insights, and one that works with their existing tools.

  • You told us that you love Jupyter Notebooks: so we released a Python library to publish your insights directly into your Matatika Mobile workspace.
  • You said you need to collaborate with your colleagues: we added comments, likes, and views of datasets to the API and Matatika Mobile.
  • You said you need a command line to integrate change management and publishing into your DataOps: so we added a CLI that works in any existing DevOps / DataOps pipeline.

We’re going to tell you more about each of these features over the coming weeks.  For today, let’s get technical and dive into a Jupyter Notebook and publish a dataset.  

It is worth saying that, while we are beta testing, we are working with data scientists who want to get into this level of detail.

If this isn’t you, don’t worry if the article stops making sense in a paragraph or two. We’re working towards plain English AI.

We will make Data Heroes of us all in no time!

What’s a notebook?

Imagine you’re an avid Data Scientist, and you’ve been hand-crafting some beautiful charts in your Python Jupyter Notebooks. One of the great things about a Notebook is that you can document your findings and analysis side by side, then share them. It is no surprise that Notebooks are the data scientists’ favourite tool, and there are more than 400,000 Notebooks just like this on Kaggle.com.

But what if you want to code in a Notebook and share only your results?

Matatika Mobile has that feature. With just 3 lines of code, including the import, you can publish your datasets to your team in a private workspace.

Sounds great.  Show me the code!

Prerequisites:

  • Python 3.7 or above installed
  • A verified Matatika account
  • A workspace created in the Matatika app
  • Data ready to publish

The Matatika Python Library allows a user to programmatically publish a dataset to a workspace, whether that be within a Jupyter Notebook or a Python script.

To install, run:

pip install matatika

To publish a dataset, simply create a new Matatika client object and call the publish method:

from matatika.client import MatatikaClient

# auth_token, endpoint_url, workspace_id and datasets initialisation assumed
matatika = MatatikaClient(auth_token, endpoint_url, workspace_id)
matatika.publish(datasets)
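
The snippet above assumes auth_token, endpoint_url and workspace_id already exist. One way to supply them without pasting secrets into a shared Notebook is to read them from environment variables. The sketch below is only an illustration: the helper name and the MATATIKA_* variable names are our own assumptions, not something the Matatika library defines.

```python
import os


def load_matatika_settings(environ=os.environ):
    """Collect the three values passed to MatatikaClient from environment variables.

    The variable names below are hypothetical; use whatever names suit your setup.
    """
    names = {
        "auth_token": "MATATIKA_AUTH_TOKEN",
        "endpoint_url": "MATATIKA_ENDPOINT_URL",
        "workspace_id": "MATATIKA_WORKSPACE_ID",
    }
    # Fail early with a clear message if any value is missing or empty.
    missing = [var for var in names.values() if not environ.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: environ[var] for key, var in names.items()}


# Possible usage with the client shown above (requires the matatika package):
# settings = load_matatika_settings()
# matatika = MatatikaClient(settings["auth_token"],
#                           settings["endpoint_url"],
#                           settings["workspace_id"])
```

Keeping the token outside the Notebook means the Notebook itself can be shared or committed without exposing credentials.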

Data must be provided as a dictionary object and must conform to the following specification:

| Path | Description |
|------|-------------|
| {dataset-alias} | A workspace-unique identifier string that the dataset can be referenced by; multiple datasets can be defined in a single datasets dictionary |
| {dataset-alias}.information | The dataset display name |
| {dataset-alias}.questions | Questions the dataset might help in answering (interpreted by the Matatika elastic search service, powered by BERT) |
| {dataset-alias}.description | A description of the dataset |
| {dataset-alias}.rawData | The raw data of the dataset, conforming to the Google Charts specification |
| {dataset-alias}.visualisation | The visualisation metadata for the dataset, conforming to the Google Charts specification |


For example, a datasets dictionary containing a single dataset:

datasets = {
    'planet-orbits': {
        'information': 'Planet Orbits in Our Solar System',
        'questions': 'How many Earth-years does it take for Jupiter to orbit the sun?',
        'description': '#Planet Orbits\nSun orbit data for all planets within our solar system.\n*Yes, Pluto is included!*',
        'rawData': '[["Planet", "Orbit Distance (Light-hours)", "Orbit Duration (Earth-years)"], ["Mercury", 0.3336, 0.2500], ["Venus", 0.6300, 0.5833], ["Earth", 0.8708, 1], ["Mars", 1.3242, 1.9167], ["Jupiter", 4.5287, 11.8333], ["Saturn", 8.2997, 29.5000], ["Uranus", 16.7030, 84.0833], ["Neptune", 26.1883, 164.9167], ["Pluto", 33.8475, 248.0833]]',
        'visualisation': '{"google-chart": {"chartType": "Bar"}}'
    }
}
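
In the example, rawData and visualisation are JSON written out by hand as strings, which is easy to get wrong in a Notebook. If your data already lives in Python lists, a small helper can serialise it for you. This is a minimal sketch that assumes, as the example suggests, that both fields are passed as JSON strings; build_dataset is our own hypothetical helper, not part of the Matatika library.

```python
import json


def build_dataset(information, questions, description, header, rows, chart_type):
    """Return one dataset entry, serialising rawData and visualisation to JSON strings."""
    return {
        "information": information,
        "questions": questions,
        "description": description,
        # Header row first, then the data rows, as in the example above.
        "rawData": json.dumps([header] + rows),
        "visualisation": json.dumps({"google-chart": {"chartType": chart_type}}),
    }


header = ["Planet", "Orbit Distance (Light-hours)", "Orbit Duration (Earth-years)"]
rows = [
    ["Mercury", 0.3336, 0.25],
    ["Venus", 0.63, 0.5833],
    ["Earth", 0.8708, 1],
    ["Mars", 1.3242, 1.9167],
    ["Jupiter", 4.5287, 11.8333],
    ["Saturn", 8.2997, 29.5],
    ["Uranus", 16.703, 84.0833],
    ["Neptune", 26.1883, 164.9167],
    ["Pluto", 33.8475, 248.0833],
]

datasets = {
    "planet-orbits": build_dataset(
        information="Planet Orbits in Our Solar System",
        questions="How many Earth-years does it take for Jupiter to orbit the sun?",
        description="#Planet Orbits\nSun orbit data for all planets within our solar system.\n*Yes, Pluto is included!*",
        header=header,
        rows=rows,
        chart_type="Bar",
    )
}
```

The resulting datasets dictionary has the same shape as the hand-written one and can be passed to matatika.publish(datasets) in the same way.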

Now head to the Matatika app and you should see your new dataset published in the workspace context!

Once you’re registered, try it out for yourself with this Notebook.
