Analysing user behaviour - from histograms to random forests (PyData)

Description

The goal is to give the audience a roadmap for analysing user data with Python-friendly tools. I will touch on many aspects of the data science pipeline, from data cleansing to building predictive data products at scale. I will start gently with pandas and DataFrames, then discuss machine learning techniques such as k-means and random forests in scikit-learn, and finally introduce Spark for doing it at scale. I will focus on use cases rather than detailed implementation. The talk will be informed by my experience and will focus on user behaviour in games and mobile apps.
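
As a rough illustration of the first steps of that roadmap (not code from the talk), the sketch below builds a small pandas DataFrame, computes a histogram, segments users with k-means, and fits a random forest. The data is synthetic, and the column names, the `make_sessions` and `analyse` helpers, and the "heavy user" label are all assumptions made up for this example.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier

FEATURES = ["sessions_per_week", "avg_session_minutes"]  # hypothetical columns


def make_sessions(n_users=200, seed=0):
    """Generate synthetic per-user activity data (illustrative only)."""
    rng = np.random.default_rng(seed)
    return pd.DataFrame({
        "user_id": np.arange(n_users),
        "sessions_per_week": rng.poisson(5, n_users),
        "avg_session_minutes": rng.gamma(2.0, 5.0, n_users),
    })


def analyse(df, n_clusters=3, seed=0):
    """Histogram -> k-means segments -> random forest on a made-up label."""
    features = df[FEATURES]

    # Step 1: a histogram of weekly sessions, the simplest look at behaviour.
    counts, bin_edges = np.histogram(df["sessions_per_week"], bins=10)

    # Step 2: unsupervised segmentation of users with k-means.
    kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    segments = kmeans.fit_predict(features)

    # Step 3: a supervised model. The label is an assumption for this sketch:
    # users whose weekly playtime is above the median count as "heavy".
    weekly_minutes = df["sessions_per_week"] * df["avg_session_minutes"]
    is_heavy = (weekly_minutes > weekly_minutes.median()).astype(int)
    forest = RandomForestClassifier(n_estimators=50, random_state=seed)
    forest.fit(features, is_heavy)

    return counts, segments, forest


if __name__ == "__main__":
    df = make_sessions()
    counts, segments, forest = analyse(df)
    print(counts)
    print(pd.Series(segments).value_counts())
    print(dict(zip(FEATURES, forest.feature_importances_)))
```

The Spark step is left out here; at scale the same stages would presumably be expressed with a distributed DataFrame and a distributed machine learning library rather than pandas and scikit-learn.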
