Turning Pandas DataFrames to Semantic Knowledge Graph – Cheuk Ting Ho (PyCon Taiwan 2021)

Description

Day 1, 14:05-14:35

Abstract

Remember how many times you have looked up “how to do this in pandas”? Although pandas is the most popular data handling library in Python, it can be quite complicated to work with because of the rigidity of storing data in tabular form. This is most obvious when the data is imported from a JSON file and ends up with multiple layers of nested objects. At that point, you wish for a data structure that lets you store data as objects and subclasses, just like in object-oriented programs. The answer? Semantic knowledge graphs.

Description

In this talk, Cheuk will first introduce what semantic knowledge graphs are, their building block (triples), and how all data can be described with them - with objects and properties. Cheuk will assume no prior knowledge and will explain via examples and visualization with the TerminusDB model builder - a graphical interface that allows you to build schemas for semantic knowledge graphs.
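As a rough illustration of the triple idea (not code from the talk), the sketch below breaks a nested record into (subject, property, value) triples using only the standard library. The function name `to_triples`, the node identifiers such as `person/1`, and the sample record are all hypothetical and chosen only for this example.

```python
from typing import Any, Dict, Iterator, Tuple

Triple = Tuple[str, str, Any]


def to_triples(subject: str, record: Dict[str, Any]) -> Iterator[Triple]:
    """Yield (subject, property, value) triples for a possibly nested dict."""
    for prop, value in record.items():
        if isinstance(value, dict):
            # A nested object becomes its own node, linked from its parent.
            child = f"{subject}/{prop}"
            yield (subject, prop, child)
            yield from to_triples(child, value)
        else:
            # A plain value is attached directly to the subject.
            yield (subject, prop, value)


# Hypothetical record, shaped like data loaded from a JSON file.
person = {"name": "Ada", "address": {"city": "Taipei", "country": "Taiwan"}}

triples = list(to_triples("person/1", person))
for triple in triples:
    print(triple)
```

The nested `address` object stays an object of its own instead of being flattened into extra columns, which is the contrast with tabular storage that the abstract draws.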

In the next part, Cheuk will show how to construct a schema based on a pandas DataFrame. With the Python client of TerminusDB, the schema can be built programmatically, followed by importing the data from the DataFrame. In this part, basic Python knowledge is assumed. Cheuk will show the internals of pandas, dissecting a DataFrame and reconstructing it as a knowledge graph schema. Cheuk will also show the code that transforms the data and inserts it into the prepared graph.
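A minimal sketch of this workflow, assuming only pandas and plain dictionaries, might look like the following. It does not use the TerminusDB Python client, and the dictionary layout, the `KIND_TO_TYPE` mapping, and the helper names `schema_from_frame` and `documents_from_frame` are illustrative assumptions, not the speaker's code or a confirmed TerminusDB schema format.

```python
import pandas as pd

# Assumed mapping from NumPy dtype "kind" codes to XSD-style type names.
KIND_TO_TYPE = {
    "i": "xsd:integer",
    "f": "xsd:decimal",
    "b": "xsd:boolean",
}


def schema_from_frame(df: pd.DataFrame, class_name: str) -> dict:
    """Describe each column of the DataFrame as a typed property of one class."""
    schema = {"@type": "Class", "@id": class_name}
    for column, dtype in df.dtypes.items():
        # Anything that is not a known numeric or boolean kind falls back to string.
        schema[column] = KIND_TO_TYPE.get(dtype.kind, "xsd:string")
    return schema


def documents_from_frame(df: pd.DataFrame, class_name: str) -> list:
    """Turn each row into a dictionary tagged with the class it belongs to."""
    return [{"@type": class_name, **row} for row in df.to_dict(orient="records")]


# Hypothetical data for illustration only.
df = pd.DataFrame(
    {"name": ["Ada", "Grace"], "age": [36, 45], "height_m": [1.65, 1.70]}
)

schema = schema_from_frame(df, "Person")
docs = documents_from_frame(df, "Person")
print(schema)
print(docs[0])
```

In a real import, the resulting schema and documents would then be handed to a graph database client; that step is left out here because the exact client calls are not given in the description.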

Finally, Cheuk will visualize the graph with a customized interactive graph visualization in a Jupyter notebook.

Slides not uploaded by the speaker. HackMD: https://hackmd.io/@pycontw/2021/%2F%40pycontw%2FHJZaKgtzt

Speaker: Cheuk Ting Ho

Cheuk has been a Data Scientist at various companies, a role that demands strong numerical and programming skills, especially in Python. To follow her passion for the tech community, Cheuk is now the Developer Relations Lead at TerminusDB - an open-source graph database. Cheuk maintains its Python client and engages with its user community daily. Besides her work, Cheuk enjoys talking about Python on her personal streaming platform and the MidMeetPy podcast. Cheuk has also been a guest speaker at universities and various conferences. On top of speaking at conferences, Cheuk also participates as an organizer. Conferences that Cheuk has organized include EuroPython (of which she is a board member), PyData Global and Pyjamas Conf. Believing in gender equality, Cheuk constantly organizes workshops and mentored sprints to support Tech Diversity and Inclusion.
