A look at how to get started with MongoDB using Python. The talk will cover basic concepts, a brief walkthrough of some more advanced features, and how Texas A&M is using MongoDB and Python to solve some large data problems.
In the last month, we've started an effort to aggregate our logs so we can do real-time, holistic log analysis. Among other things, we are tracking failed logins across all points of entry, identifying possibly compromised accounts (simultaneous logins from multiple IPs, geographically disparate logins), and identifying high-volume mailers across multiple mail relays. The talk will include an introduction to MongoDB (NoSQL, data structures, querying, indexing, differences from relational databases, etc.); some important performance and reliability features, such as replica sets, sharding, and map/reduce; and some very cool features like GridFS and geospatial indexing. All of this, with the exception of database configuration, will be demoed with MongoDB's Python client.
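
To give a flavor of the failed-login tracking described above, here is a minimal sketch, not the code used at Texas A&M. It assumes a hypothetical log document shape with `user`, `ip`, `event`, and `ts` fields, and a hypothetical `login_failed` event name. The first function only builds an aggregation pipeline as plain Python data; the second is an in-memory equivalent so the logic can be tried without a running database.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def failed_login_pipeline(since, min_failures=5):
    """Build an aggregation pipeline that counts failed logins per account.

    Field names (user, ip, event, ts) and the "login_failed" value are
    assumptions made for this sketch, not a schema from the talk.
    """
    return [
        # Keep only recent failed-login events.
        {"$match": {"event": "login_failed", "ts": {"$gte": since}}},
        # Count failures per account and collect the distinct source IPs.
        {"$group": {
            "_id": "$user",
            "failures": {"$sum": 1},
            "ips": {"$addToSet": "$ip"},
        }},
        # Report only accounts at or above the threshold, worst first.
        {"$match": {"failures": {"$gte": min_failures}}},
        {"$sort": {"failures": -1}},
    ]


def count_failed_logins(events, since, min_failures=5):
    """In-memory equivalent of the pipeline, for a list of event dicts."""
    failures = defaultdict(int)
    ips = defaultdict(set)
    for event in events:
        if event["event"] == "login_failed" and event["ts"] >= since:
            failures[event["user"]] += 1
            ips[event["user"]].add(event["ip"])
    rows = [
        # IPs are sorted here for stable output; $addToSet gives no order.
        {"_id": user, "failures": count, "ips": sorted(ips[user])}
        for user, count in failures.items()
        if count >= min_failures
    ]
    return sorted(rows, key=lambda row: row["failures"], reverse=True)


if __name__ == "__main__":
    now = datetime(2011, 1, 1, 12, 0, 0)
    sample = [
        {"user": "alice", "ip": "10.0.0.1", "event": "login_failed", "ts": now},
        {"user": "alice", "ip": "10.0.0.2", "event": "login_failed", "ts": now},
        {"user": "bob", "ip": "10.0.0.9", "event": "login_failed", "ts": now},
    ]
    for row in count_failed_logins(sample, now - timedelta(hours=1), min_failures=2):
        print(row)
```

With the Python client (PyMongo), the pipeline could be passed to a collection's `aggregate` method, for example `MongoClient().logs.events.aggregate(failed_login_pipeline(since))`, where the `logs` database and `events` collection are again hypothetical names. An index on `ts` would likely help the initial `$match` stage.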