MongoDB is a flexible, scalable, and easy-to-use way to store large data sets. Python provides many useful data science tools (e.g., NumPy, SciPy, and scikit-learn). Unfortunately, the two do not work well together: one of the bottlenecks is the inefficiency of loading MongoDB data into NumPy arrays.
This talk will discuss the concerns involved in creating operational data analytics pipelines, introduce Monary as an alternative for loading data into NumPy, give examples of accessing data with Monary, and show how to build scalable data analysis pipelines using these open source tools.