Description
PySpark is a great tool for building big data ETL pipelines. However, designing a big data pipeline that is easy to maintain, offers a holistic view, and makes bottlenecks simple to spot is difficult, let alone enabling analytics on the ETL pipelines themselves. Rockie Yang will share his experience building effective ETL pipelines with PySpark.