Description
Data quality remains a core concern for practitioners of machine learning, data science, and data engineering, and in recent years specialized packages have emerged to validate and monitor data and models. However, as the open-source community iterates on data frameworks (notably highly performant entrants such as Polars), data quality libraries need to catch up to support them. In this talk, you will learn about Pandera and its journey from a pandas-only validator to a generic tool for testing arbitrary data containers, one that provides a standardized way of creating data validation tools.
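
To make the idea of validating arbitrary data containers concrete, here is a minimal sketch using only the Python standard library. It is not Pandera's API or implementation: the names `get_column` and `validate`, and the two toy containers (a column-oriented dict and a row-oriented list of dicts), are assumptions chosen purely for illustration. The point is that the checks are written once and each container type only needs a small backend that knows how to read a column.

```python
from functools import singledispatch
from typing import Callable, Dict, List


# Hypothetical backend hook: read one column from a container.
# Each supported container type registers its own implementation.
@singledispatch
def get_column(data, name: str) -> list:
    raise TypeError(f"no backend registered for {type(data).__name__}")


@get_column.register
def _(data: dict, name: str) -> list:
    # Column-oriented container: {"col": [values, ...]}
    return list(data[name])


@get_column.register
def _(data: list, name: str) -> list:
    # Row-oriented container: [{"col": value}, ...]
    return [row[name] for row in data]


def validate(data, checks: Dict[str, Callable[[object], bool]]) -> List[str]:
    """Run per-value checks on each named column.

    Returns a list of error messages; an empty list means the data passed.
    """
    errors = []
    for column, check in checks.items():
        for i, value in enumerate(get_column(data, column)):
            if not check(value):
                errors.append(f"{column}[{i}]={value!r} failed check")
    return errors


# The same checks apply to both container types.
checks = {"price": lambda v: v >= 0}
columnar = {"price": [1.5, -2.0, 3.0]}
rows = [{"price": 1.5}, {"price": -2.0}, {"price": 3.0}]

columnar_errors = validate(columnar, checks)
row_errors = validate(rows, checks)
```

In this sketch, supporting a new container means registering one more `get_column` implementation, while the check definitions and the `validate` loop stay unchanged.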