
Running Your Data Import Locally

  • Time required: 15 minutes

Prerequisites

You must have:

  • Ownership of, or administrative access to, a workspace
  • Git installed
  • Python 3.7 or later installed
  • Meltano installed (a virtual environment is recommended)

Introduction

Data import pipelines in Matatika workspaces are run using Meltano. Each Matatika workspace is backed by a GitHub repository containing a Meltano project, which can easily be cloned locally in order to run data import pipelines outside of the Matatika platform.


Setup

  1. Within the Matatika app, switch to the workspace that contains the data import pipeline you wish to run locally
  2. Navigate to the workspace ‘Settings’ page and copy the repository URL
  3. Clone the workspace to your local system
     git clone https://github.com/MatatikaBytes/example-workspace
    
  4. Change into the cloned directory and create a new .env file
  5. Back in the Matatika app, navigate to the workspace ‘Data Imports’ page and expand the data import pipeline you wish to run locally
  6. Select the ‘Environment’ tab and click the .env text field to copy the environment configuration
  7. Paste the copied environment configuration into the .env file you created earlier
     TAP_EXAMPLE_CLIENT_ID=clientid
     TAP_EXAMPLE_CLIENT_SECRET=clientsecret
     TAP_EXAMPLE_START_DATE=2022-01-01T00:00
     TARGET_EXAMPLE_HOST=example.host.com
     TARGET_EXAMPLE_PORT=1234
     TARGET_EXAMPLE_DB=db
     TARGET_EXAMPLE_SCHEMA=schema
     TARGET_EXAMPLE_USERNAME=username
     TARGET_EXAMPLE_PASSWORD=password
    

Your local workspace repository should now be set up similarly to this one: GitHub Example Link
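
Steps 4 and 7 above can also be combined in the shell: the heredoc below writes the environment configuration to a new .env file (all values shown are the placeholders from the example above, to be replaced with the configuration copied from the app):

```shell
# Write the example environment configuration to a new .env file
# (replace these placeholder values with the ones copied from the Matatika app)
cat > .env <<'EOF'
TAP_EXAMPLE_CLIENT_ID=clientid
TAP_EXAMPLE_CLIENT_SECRET=clientsecret
TAP_EXAMPLE_START_DATE=2022-01-01T00:00
TARGET_EXAMPLE_HOST=example.host.com
TARGET_EXAMPLE_PORT=1234
TARGET_EXAMPLE_DB=db
TARGET_EXAMPLE_SCHEMA=schema
TARGET_EXAMPLE_USERNAME=username
TARGET_EXAMPLE_PASSWORD=password
EOF
```

The quoted heredoc delimiter ('EOF') prevents the shell from expanding anything inside the block, so the values are written verbatim.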

Running Locally

(Activate your virtual environment first, if you are using one for Meltano.)

  1. Install the extractor
     meltano install extractor tap-example
    
  2. Install the loader
     meltano install loader target-example
    
  3. Run your data import pipeline
     meltano elt tap-example target-example

Note: on Meltano 2.0 and later, meltano elt is deprecated in favour of meltano run tap-example target-example, which performs the same extract-load operation.