
Messy Sensor Data: A Programmer’s Cleaning Guide


Humans love to invent. From sundials to clocks, timers, and embedded computer sensors, humans have a symbiotic relationship with their creations. They create and maintain sensors like weather stations, and these sensors tell them what the weather is like outside. Inventions like weather stations are so widely used that most people never imagine they could malfunction. Worse, a sensor might work well enough to record data while being defective enough to record the wrong numbers. How do we make the best of our data? In this talk, we will discuss different ways to clean up inconsistent data formats and remove invalid samples, using a weather station as an example. Then, we will present tools to efficiently snap our data points to a grid and fill in gaps to tidy up our data. Finally, we will look at how to prepare and store data for use in data science. This talk uses Python and open source packages, but the concepts apply to any general-purpose programming language.
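As a rough illustration of the steps named above (removing invalid samples, snapping to a grid, filling gaps), here is a minimal sketch using pandas. It is not taken from the talk: the column name `temp_c`, the function `clean_readings`, the plausible-range limits, the 10-minute grid, and the sample readings are all hypothetical choices made for this example.

```python
import pandas as pd

# Hypothetical raw readings: irregular timestamps and one sentinel value (-999).
raw = pd.DataFrame(
    {
        "time": pd.to_datetime(
            [
                "2024-01-01 00:00:07",
                "2024-01-01 00:09:58",
                "2024-01-01 00:20:03",
                "2024-01-01 00:40:01",
            ]
        ),
        "temp_c": [4.1, 4.3, -999.0, 5.0],
    }
)


def clean_readings(df, low=-60.0, high=60.0, step="10min"):
    """Return a regularly spaced temperature series (assumed limits and step)."""
    out = df.copy()
    # 1. Invalid samples: values outside the assumed plausible range become NaN.
    out["temp_c"] = out["temp_c"].where(out["temp_c"].between(low, high))
    # 2. Snap each timestamp to the nearest grid point.
    out["time"] = out["time"].dt.round(step)
    # 3. If several samples land on the same grid point, average them.
    series = out.groupby("time")["temp_c"].mean()
    # 4. Reindex onto the full grid so missing slots show up as NaN.
    grid = pd.date_range(series.index.min(), series.index.max(), freq=step)
    series = series.reindex(grid)
    # 5. Fill gaps by time-weighted interpolation, at most one slot per gap,
    #    so longer outages stay visibly missing.
    return series.interpolate(method="time", limit=1)


cleaned = clean_readings(raw)
print(cleaned)
```

Capping the interpolation with `limit=1` is a design choice for this sketch: a single missing slot is filled, while the rest of a longer gap is left as NaN rather than being invented.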

