There is an abundance of data on sites such as Wikipedia, Facebook, and Instagram, much of which can be accessed through web APIs. But how do we know that the data from the Wikipedia article on "Golden Gate Bridge" describes the same entity as the data from the "Golden Gate Bridge" Facebook page? This is a central question when integrating data from multiple sources.
In this talk, I'll outline key aspects of structured data mining, data integration, and entity resolution, and how these methods fit together in a scalable system.
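
To make the matching question concrete, here is a minimal sketch of name-based entity resolution. It is an illustration only, not the method presented in the talk: the record fields (`source`, `name`), the helper names, and the 0.8 threshold are all assumptions chosen for the example.

```python
import re


def normalize(name: str) -> set[str]:
    """Lowercase a name, drop punctuation, and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", name.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def same_entity(rec_a: dict, rec_b: dict, threshold: float = 0.8) -> bool:
    """Treat two records as the same entity if their names are similar enough.

    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    score = jaccard(normalize(rec_a["name"]), normalize(rec_b["name"]))
    return score >= threshold


# Hypothetical records, shaped as if pulled from two different web APIs.
wiki = {"source": "wikipedia", "name": "Golden Gate Bridge"}
fb = {"source": "facebook", "name": "golden gate bridge!"}
other = {"source": "facebook", "name": "Golden Gate Park"}

print(same_entity(wiki, fb))     # names share all tokens
print(same_entity(wiki, other))  # names share only two of four tokens
```

Name similarity alone is a weak signal. A practical system would likely combine it with other attributes (location, links between pages, categories) and would avoid comparing every pair of records, for example by first grouping candidates that share a token.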