Description
[EuroPython 2023 — South Hall 2B on 2023-07-20]
https://ep2023.europython.eu/session/solving-data-problems-in-management-accounting
Controllers deal with numbers all day long. They have to check a lot of data from different sources. Often the reports contain erroneous or missing data. Identifying outliers and suspicious data is time-consuming.
This presentation introduces an end-to-end workflow for a small-data problem, using statistical tools and machine learning to make controllers' jobs easier and help them be more productive.
We will demonstrate how we used, among others,
- [scipy](https://scipy.org/)
- [pandera](https://pandera.readthedocs.io/en/stable/)
- [dirty cat](https://dirty-cat.github.io/stable/)
- [nltk](https://www.nltk.org/)
- [fastnumbers](https://pypi.org/project/fastnumbers/)
to create a self-improving system that automates the screening of reports and flags outliers in advance, so that they can be eliminated more quickly.
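To give a feel for what automated screening of report figures can look like, here is a minimal sketch of one common outlier check, the modified z-score based on the median absolute deviation. It is an illustration under stated assumptions, not the method shown in the talk: the function name `flag_outliers`, the threshold of 3.5, and the sample figures are all hypothetical, and it uses only the Python standard library rather than the packages listed above.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds threshold.

    Hypothetical helper for illustration; the threshold of 3.5 is a
    commonly used rule of thumb, not a value taken from the talk.
    """
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)  # median absolute deviation
    if mad == 0:
        return []  # no spread to measure against, so nothing is flagged
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    # when the data are roughly normally distributed.
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Made-up monthly cost figures with one suspicious entry.
monthly_costs = [1020.0, 998.5, 1011.0, 1005.2, 9870.0, 1002.3]
print(flag_outliers(monthly_costs))  # → [4]
```

A median-based score is used here because a single extreme value barely shifts the median, whereas it can inflate a mean and standard deviation enough to hide itself.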
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License http://creativecommons.org/licenses/by-nc-sa/4.0/