Cassandra for Python Developers

Summary

Apache Cassandra is an open source, distributed NoSQL database. This talk gives a high-level introduction to Cassandra and its data model, details the features of pycassa, the Python client library for Cassandra, and shows how to interact with Cassandra through it.

Description

Because Cassandra is non-relational, its data model is fundamentally different from that of a relational database. In addition, it uses an RPC-based API rather than a query language. On top of that, Cassandra is a distributed database, so the client must be aware of, and interact with, multiple nodes in the cluster. Together, these attributes make working with Cassandra's client libraries a different experience from working with a relational database driver. Fortunately, the Python client library is the easiest way to use Cassandra.

This talk will start with a high-level overview of Cassandra's clustering model, followed by its data model. A large portion of the talk will cover the pycassa methods that interact with that data model: inserting, fetching, and removing data. A small amount of time will be dedicated to connection pooling in pycassa, including how it handles node failures and distributes requests. The final 10 minutes will be devoted to Q&A.
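
To make the insert, fetch, and remove operations mentioned above more concrete, here is a minimal sketch of the row-key-to-columns shape of the data model. It is a toy in-memory stand-in, not pycassa and not the talk's own code: ToyColumnFamily, the "jsmith" row key, and the column names are all hypothetical. The method names insert, get, and remove loosely mirror those on pycassa's ColumnFamily class, but the real library talks to a running cluster and raises its own exception types.

```python
# Toy in-memory stand-in for a Cassandra column family (assumption: this is
# an illustration of the data model only; it is NOT pycassa).
class ToyColumnFamily:
    def __init__(self):
        # row key -> {column name: column value}
        self._rows = {}

    def insert(self, key, columns):
        """Add or overwrite columns in the row stored under `key`."""
        self._rows.setdefault(key, {}).update(columns)

    def get(self, key, columns=None):
        """Return the row's columns sorted by name, optionally only `columns`."""
        row = self._rows.get(key, {})
        if columns is not None:
            row = {name: value for name, value in row.items() if name in columns}
        if not row:
            # The toy raises KeyError when nothing matches.
            raise KeyError(key)
        return dict(sorted(row.items()))

    def remove(self, key, columns=None):
        """Delete the named columns, or the whole row when `columns` is None."""
        if columns is None:
            self._rows.pop(key, None)
            return
        row = self._rows.get(key, {})
        for name in columns:
            row.pop(name, None)
        if not row:
            # A row with no columns left no longer exists.
            self._rows.pop(key, None)


users = ToyColumnFamily()
users.insert("jsmith", {"email": "jsmith@example.com", "city": "Montreal"})
users.insert("jsmith", {"city": "Austin"})      # overwrites a single column
profile = users.get("jsmith")                   # columns come back sorted by name
users.remove("jsmith", columns=["email"])       # removes one column only
remaining = users.get("jsmith")
```

The point of the sketch is that a row is addressed by its key and holds a sorted set of named columns, so writes and deletes can target individual columns without touching the rest of the row.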