
Simple ETL in Python 3.5+ with Bonobo


Description

Simple is better than complex, right? That’s true for data pipelines too.

For more than 5 years, I hacked together extract-transform-load (ETL) processes in a variety of positions (ETL is just a fancy term for «a bunch of things that take data from somewhere and put it elsewhere, possibly transformed along the way»).

I did it as a founder, as a consultant, as a technical co-founder, for some side projects, and now in a big corp (to be continued…).

In each case, I felt frustrated with the tools available, and in the worst cases, I had to hack things together myself to get the job done.

https://www.bonobo-project.org/

Bonobo is the repackaging of my past experiences for Python 3.5+, and grasping the basics should take no longer than the presentation itself.

Topic outline (subject to small changes):

  • INTRO: State of the art / different tools for different needs.
  • Where it comes from.
  • Writing a data processor.
  • Running and monitoring data jobs.
  • OUTRO: The road ahead.
  • Q&A

Bonobo is the glue you need to tie regular functions together into a transformation graph (think Unix pipes). Execution strategies are abstracted away so you can focus on the actual operations. As a result, you can engineer simple, testable systems using the same good software development practices you use in -insert your favorite field here-.
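To make the «regular functions tied together like Unix pipes» idea concrete, here is a minimal sketch using only the Python standard library. It is not Bonobo's API: `run_chain`, `extract`, `parse` and `only_adults` are hypothetical names invented for this illustration, and the in-memory rows stand in for a real data source. Each step is a plain generator function, so it can be tested on its own.

```python
def extract():
    """Hypothetical source: yield raw CSV-like rows from memory."""
    yield from ["alice,30", "bob,17", "carol,41"]


def parse(row):
    """Turn one raw row into one record (a dict)."""
    name, age = row.split(",")
    yield {"name": name.title(), "age": int(age)}


def only_adults(record):
    """Pass a record through only if age >= 18; otherwise yield nothing."""
    if record["age"] >= 18:
        yield record


def _apply(step, stream):
    """Feed every item of `stream` to `step` and flatten what it yields."""
    for item in stream:
        yield from step(item)


def run_chain(source, *steps):
    """Pipe the output of `source()` through each step in order, lazily.

    This naive, sequential runner is an assumption of the sketch; a real
    tool can swap in other execution strategies without touching the steps.
    """
    stream = source()
    for step in steps:
        stream = _apply(step, stream)
    return list(stream)


if __name__ == "__main__":
    for record in run_chain(extract, parse, only_adults):
        print(record)
```

Because the steps never call each other directly, the runner is the only place that knows how data moves between them, which is what lets the execution strategy change independently of the transformations.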

Spoiler: there is no «big data» in this talk.
