Using MongoDB and Python for data analysis pipelines (PyData)

Description

MongoDB is a flexible, scalable, and easy-to-use way of storing large data sets. Python provides many useful data science tools (e.g. NumPy, SciPy, and scikit-learn). Unfortunately, the two do not work well together: one of the bottlenecks is the inefficiency of loading MongoDB data into NumPy arrays.

This talk discusses the concerns involved in creating operational data analytics pipelines, introduces Monary as an alternative for loading data into NumPy, and gives examples of accessing data with Monary and of building scalable data analysis pipelines with these open source tools.
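
The loading bottleneck mentioned in the description can be illustrated with a minimal sketch. It does not connect to MongoDB: the `docs` list is hypothetical data shaped like the dictionaries a driver would return, and `docs_to_columns` is an illustrative helper, not part of any library. It shows the document-by-document copy that a tool such as Monary is meant to avoid by filling NumPy arrays directly.

```python
import numpy as np

# Hypothetical documents, shaped like the dicts a MongoDB driver returns
# (one Python dict per document). No database connection is made here.
docs = [
    {"sensor": 1, "temp": 20.5},
    {"sensor": 2, "temp": 21.0},
    {"sensor": 3, "temp": 19.5},
]


def docs_to_columns(documents, fields_and_types):
    """Copy selected fields from dict documents into one NumPy array per field."""
    documents = list(documents)
    columns = {
        name: np.empty(len(documents), dtype=dtype)
        for name, dtype in fields_and_types
    }
    # Every value passes through a Python object before it lands in the array.
    # This per-document, per-field loop is the overhead in question.
    for i, doc in enumerate(documents):
        for name, _ in fields_and_types:
            columns[name][i] = doc[name]
    return columns


columns = docs_to_columns(docs, [("sensor", "int32"), ("temp", "float64")])
mean_temp = columns["temp"].mean()
```

Once the data is in column arrays, NumPy operations such as `mean` run without further per-document work. The cost sits in the conversion loop, which grows with both the number of documents and the number of fields.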
