Moritz Gronbach - What's the fuzz all about? Randomized data generation for robust unit testing [EuroPython 2015] [23 July 2015] [Bilbao, Euskadi, Spain]
In static unit testing, the output of a function is compared to a precomputed result. Even though such unit tests may appear to cover all the code in a function, they may exercise only a small subset of the function's behaviours. This can allow bugs such as Heartbleed to go undetected. Dynamic unit tests based on fuzzing, in which you specify a template for generating test data, can make your test suite more robust.
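To make the coverage point concrete, here is a minimal sketch using only the standard library. The `echo` function is a hypothetical toy loosely modelled on the missing bounds check behind Heartbleed; it is not OpenSSL code, and the names `SECRET`, `static_test` and `find_counterexample` are invented for illustration. The static test executes every line of `echo` and passes, while a randomized test of a simple invariant (the reply must be a prefix of the payload) finds a failing input.

```python
import random

# Stands in for whatever happens to sit in memory next to the payload.
SECRET = b"private-key"


def echo(payload, claimed_length):
    """Toy heartbeat: return the first claimed_length bytes of payload.

    Bug: claimed_length is never checked against len(payload).
    """
    memory = payload + SECRET
    return memory[:claimed_length]


def static_test():
    # Executes every line of echo, but only with an honest length.
    return echo(b"bird", 4) == b"bird"


def find_counterexample(trials=1000, seed=0):
    """Randomized test of the invariant 'the reply is a prefix of the payload'."""
    rng = random.Random(seed)
    for _ in range(trials):
        payload = bytes(rng.randrange(256) for _ in range(rng.randrange(8)))
        claimed_length = rng.randrange(16)
        reply = echo(payload, claimed_length)
        if not payload.startswith(reply):
            return payload, claimed_length
    return None
```

Calling `static_test()` returns `True`, whereas `find_counterexample()` returns a payload together with a claimed length that exceeds the payload's real length, which is exactly the behaviour the fixed example never exercised.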
In this talk, we demonstrate fuzzing using the Hypothesis library. Hypothesis is a Python library that automatically generates test data based on a template. Data is generated using a strategy, which specifies how data is generated and how falsifying examples can be simplified. Hypothesis provides strategies for Python's built-in data types and is easily customizable.
Since test data is generated automatically, we cannot compare against precomputed results. Instead, tests are usually written against invariants of functions. We give an overview of such invariants.
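As a rough illustration of invariant-based tests, the sketch below uses Hypothesis's `given` decorator with the built-in `text`, `lists` and `integers` strategies. The run-length encoder and the two invariants chosen (a round trip, and properties of sorting) are assumptions made for this example and are not taken from the talk.

```python
from hypothesis import given, strategies as st


def run_length_encode(text):
    """Hypothetical function under test: 'aab' -> [('a', 2), ('b', 1)]."""
    pairs = []
    for ch in text:
        if pairs and pairs[-1][0] == ch:
            pairs[-1] = (ch, pairs[-1][1] + 1)
        else:
            pairs.append((ch, 1))
    return pairs


def run_length_decode(pairs):
    """Inverse of run_length_encode."""
    return "".join(ch * count for ch, count in pairs)


@given(st.text())
def test_roundtrip(text):
    # Invariant: decoding an encoded string gives back the original.
    assert run_length_decode(run_length_encode(text)) == text


@given(st.lists(st.integers()))
def test_sorted_invariants(xs):
    result = sorted(xs)
    # Invariant: sorting keeps the number of elements.
    assert len(result) == len(xs)
    # Invariant: the output is in non-decreasing order.
    assert all(a <= b for a, b in zip(result, result[1:]))
    # Invariant: sorting an already sorted list changes nothing.
    assert sorted(result) == result
```

Neither test contains a precomputed expected value. Each one runs under a test runner such as pytest, or can be called directly as `test_roundtrip()`; if an invariant fails, Hypothesis reports a simplified falsifying example.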
Finally, we demonstrate how we use fuzzing to test machine learning algorithms at Blue Yonder.