(Simon Prickett) Being right all the time isn't necessarily the best idea. This talk examines how to count distinct items from a firehose of data, how to determine if we've seen a given item before, and why absolute accuracy may be impractical when doing so.
Probabilistic data structures trade absolute accuracy for speed and economy of resources, returning approximate results instead. They provide fast, scalable solutions to problems such as counting likes on social media posts, or determining which articles on a website a user has previously read.
I'll introduce the HyperLogLog and the Bloom filter, explain how they work at a high level, and demonstrate different ways in which each can be used in Python.
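As a rough illustration of the two structures named above, here is a minimal pure-Python sketch. It is not the code from the talk: the class names, the default sizes (`num_bits`, `num_hashes`, `precision`) and the use of SHA-256 as the hash function are all assumptions chosen for readability, and a production system would more likely use a tuned library or a data store with built-in support.

```python
import hashlib
import math


class BloomFilter:
    """Set-membership sketch: may report false positives, never false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):  # sizes are illustrative
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # "Possibly seen" only if every one of the item's bits is set.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


class HyperLogLog:
    """Approximate distinct counter using 2**precision small registers."""

    def __init__(self, precision=10):  # 1024 registers, roughly 3% standard error
        self.p = precision
        self.m = 1 << precision
        self.registers = [0] * self.m

    def add(self, item):
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")           # 64-bit hash value
        index = h >> (64 - self.p)                      # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)           # remaining bits
        rank = (64 - self.p) - rest.bit_length() + 1    # position of the first 1 bit
        self.registers[index] = max(self.registers[index], rank)

    def count(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)           # bias constant for m >= 128
        raw = alpha * self.m ** 2 / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Small-range correction: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


# Which articles has this user already read?
read_articles = BloomFilter()
read_articles.add("article-42")
print("article-42" in read_articles)   # True: added items are always found

# How many distinct users liked a post, with duplicates in the stream?
likes = HyperLogLog()
for i in range(30000):
    likes.add(f"user-{i % 10000}")      # 10,000 distinct users, each seen 3 times
print(round(likes.count()))             # an estimate near 10,000, not an exact count
```

Both structures use a fixed amount of memory regardless of how many items pass through them, which is the trade the description refers to: the Bloom filter may occasionally claim to have seen an item it has not, and the HyperLogLog returns an estimate rather than an exact count.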
Videos licensed as CC-BY-NC-SA 4.0
PyCon AU is the national conference for the Python programming community, bringing together professional, student and enthusiast developers, sysadmins and operations folk, students, educators, scientists, statisticians, and many others besides, all with a love for working with Python.
PyCon AU informs the country’s Python developers with presentations, tutorials and panel sessions by experts and core developers of Python, as well as the libraries and frameworks that they rely on.
Produced by Next Day Video Australia: https://nextdayvideo.com.au
Sat Sep 11 16:45:00 2021 at Platypus Hall