The Zen of the Bronze Layer: Ingestion of Data with Unstable Schema

Description

In the medallion data architecture, the bronze layer stages incoming raw data before further transformation and cleaning. Ideally, tabular CSV data undergoes minimal transformation and is queryable upon ingestion; however, third-party data sources can have unstable schemas that make this challenging, even with pandas. With native Python data structures and a more flexible data schema, such messy data can be ingested more reliably for cleaning and monitoring.
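As a rough illustration of the idea (not the speaker's implementation), the sketch below reads CSV text with the standard library `csv` module and stores each row as a JSON payload instead of forcing it into fixed columns. The function name `ingest_rows` and the record fields `source`, `line`, `payload`, `extra_values`, and `missing_columns` are hypothetical names chosen for this example.

```python
import csv
import io
import json


def ingest_rows(text, source):
    """Read CSV text into bronze-style records without enforcing a schema.

    Hypothetical helper: each row is kept as a JSON payload, and any
    mismatch with the header is recorded rather than raised.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to ingest

    records = []
    for row in reader:
        # zip() stops at the shorter sequence, so ragged rows never fail here.
        payload = dict(zip(header, row))
        records.append({
            "source": source,
            "line": reader.line_num,            # line in the input where the row ended
            "payload": json.dumps(payload),     # raw values, kept as strings
            "extra_values": row[len(header):],  # values with no matching column
            "missing_columns": header[len(row):],  # columns the row did not fill
        })
    return records
```

Because every value stays a string and mismatches are recorded per row, a new or dropped column in the source shows up as data to monitor instead of a failed load.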

For more information, see https://pybay.org/

Follow us on LinkedIn https://www.linkedin.com/company/pybay
