Real-time feeds of user activity from apps such as Twitter and Foursquare are becoming increasingly available. These 'digital footprints' provide new means to understand how individuals use the places and spaces of urban environments. We present a lightweight framework for real-time event detection in a city, built on existing SciPy libraries and real-time Twitter streams.
In this paper, we use real-time 'social information sources' to automatically detect important events at the urban scale. The goal is to provide city planners and others with information on what is going on, and when and where it is happening. Traditionally, this type of analysis would require a large investment in heavy-duty computing infrastructure. We suggest instead that a focus on real-time analytics in a lightweight streaming framework is the most logical step forward.
Using online Latent Semantic Analysis (LSA) from the `gensim <https://github.com/piskvorky/gensim/>`__ Python package, we extract 'topics' from tweets in an online training fashion. To maintain real-time relevance, the topic model is continually updated and, depending on parameterization, can 'forget' past topics. Based on the set of learned topics, a grid of spatially located tweets is generated for each identified topic using standard numpy and scipy.spatial functionality. An efficient streaming algorithm for approximating 2D kernel density estimation (KDE) then identifies the locations with the highest density of tweets on a particular topic. These locations are semantically labeled using the learned topics, based on the assumption that an event can be tied directly to a particularly popular topic at a particular location.
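
The density step can be illustrated with a minimal sketch. It is not the streaming KDE algorithm used in the framework; it substitutes a simple binned approximation (a 2D histogram smoothed with a separable Gaussian kernel) to show how a peak location falls out of gridded tweet coordinates. The function names, the bounding box, the grid size, the bandwidth, and the synthetic coordinates are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kernel_1d(sigma, truncate=3.0):
    """Normalized 1D Gaussian kernel; sigma is measured in grid cells."""
    radius = max(1, int(truncate * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def binned_density(lon, lat, bounds, bins=64, sigma=2.0):
    """Approximate a 2D KDE by binning points and smoothing the counts.

    Assumes the kernel is shorter than the grid (true for the defaults),
    so that np.convolve(..., mode="same") preserves the grid shape.
    """
    counts, lon_edges, lat_edges = np.histogram2d(
        lon, lat, bins=bins, range=[list(bounds[0]), list(bounds[1])]
    )
    kernel = gaussian_kernel_1d(sigma)
    # A 2D Gaussian is separable: smooth along each axis in turn.
    smooth = np.apply_along_axis(np.convolve, 0, counts, kernel, mode="same")
    smooth = np.apply_along_axis(np.convolve, 1, smooth, kernel, mode="same")
    return smooth, lon_edges, lat_edges


def densest_cell(density, lon_edges, lat_edges):
    """Return the (lon, lat) center of the grid cell with the highest density."""
    i, j = np.unravel_index(np.argmax(density), density.shape)
    lon_center = 0.5 * (lon_edges[i] + lon_edges[i + 1])
    lat_center = 0.5 * (lat_edges[j] + lat_edges[j + 1])
    return lon_center, lat_center


# Synthetic stand-in for tweets on one topic (hypothetical coordinates):
# uniform background noise plus one tight cluster playing the role of an event.
rng = np.random.default_rng(0)
bounds = ((-74.05, -73.90), (40.68, 40.82))  # (lon range, lat range)
background_lon = rng.uniform(*bounds[0], size=200)
background_lat = rng.uniform(*bounds[1], size=200)
event_lon = rng.normal(-73.98, 0.005, size=300)
event_lat = rng.normal(40.75, 0.005, size=300)

lon = np.concatenate([background_lon, event_lon])
lat = np.concatenate([background_lat, event_lat])

density, lon_edges, lat_edges = binned_density(lon, lat, bounds)
peak = densest_cell(density, lon_edges, lat_edges)
```

Because the counts grid is additive, new tweets could be binned and added to it as they arrive, which is one reason a gridded approximation suits a streaming setting. In a full pipeline, one such grid would be kept per learned topic, and the peak cell would be labeled with that topic.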