Jim Blomo

Number of videos: 1
mrjob: Snakes on a Hadoop
PyCon US 2014
Jim Blomo
Recorded: April 11, 2014
Language: English

This tutorial takes participants through basic usage of mrjob by writing analytics jobs over Yelp data. mrjob lets you write, run, and test distributed batch jobs in Python on top of Hadoop. Hadoop is a MapReduce platform for processing big data, but using it directly requires a fair amount of Java boilerplate. mrjob is an open source Python library written by Yelp and used to process terabytes of data every day.
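
To give a rough idea of what such an analytics job looks like, here is a minimal plain-Python sketch of the mapper/reducer shape. It does not import mrjob or touch Hadoop. In mrjob, the two functions would be written as methods on an `MRJob` subclass, and the framework would take care of grouping and distribution. The record fields (`business_id`, `stars`), the `run_local` helper, and the sample data are assumptions made for illustration and are not taken from the tutorial.

```python
import json
from collections import defaultdict


def mapper(_, line):
    """Emit (business_id, stars) for one JSON-encoded review line."""
    review = json.loads(line)
    yield review["business_id"], review["stars"]


def reducer(business_id, stars):
    """Average all star ratings seen for one business."""
    stars = list(stars)
    yield business_id, sum(stars) / len(stars)


def run_local(lines):
    """Hypothetical helper: run map, shuffle, and reduce in one process."""
    # Shuffle step: group every mapped value under its key.
    grouped = defaultdict(list)
    for line in lines:
        for key, value in mapper(None, line):
            grouped[key].append(value)

    # Reduce step: call the reducer once per key.
    results = {}
    for key in sorted(grouped):
        for out_key, out_value in reducer(key, grouped[key]):
            results[out_key] = out_value
    return results


# Made-up sample records, not real Yelp data.
SAMPLE_REVIEWS = [
    '{"business_id": "b1", "stars": 4}',
    '{"business_id": "b2", "stars": 2}',
    '{"business_id": "b1", "stars": 5}',
]

if __name__ == "__main__":
    for business_id, average in run_local(SAMPLE_REVIEWS).items():
        print(business_id, average)
```

Keeping the mapper and reducer as small generator functions is what makes this style of job easy to test: each can be called directly on a handful of records before anything is run on a cluster.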