
Writing highly scalable and provenanceable data pipelines


In this talk we will explore launching and maintaining highly scalable data pipelines on Kubernetes. We will walk through setting up a Pachyderm cluster and deploying Python-based data processing workloads. This setup lets teams develop and maintain robust data pipelines, with the benefits of autoscaling clusters and rapid code iteration.
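To make the workload side concrete: Pachyderm runs user code in a container, mounting each versioned input repo under `/pfs/<repo>` and collecting anything written to `/pfs/out` as the pipeline's output commit. A minimal sketch of such a Python processing step follows — the repo name `raw-data` and the line-count transform are illustrative assumptions, not part of the talk:

```python
import pathlib


def process(input_dir: str, output_dir: str) -> None:
    """Count lines in each input text file, writing one result file per input.

    In a Pachyderm pipeline, input_dir would be /pfs/<repo> and
    output_dir would be /pfs/out; the dirs are parameters here so the
    step can also be run and tested locally.
    """
    out = pathlib.Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    for path in pathlib.Path(input_dir).glob("*.txt"):
        line_count = sum(1 for _ in path.open())
        (out / f"{path.stem}.count").write_text(str(line_count))


if __name__ == "__main__":
    # "raw-data" is an assumed input repo name for this sketch.
    process("/pfs/raw-data", "/pfs/out")
```

Because the transform only reads and writes files, Pachyderm can shard the input across many parallel workers and autoscale them on Kubernetes without changes to the Python code.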

