This talk covers techniques for classifying what data we share and with whom. In the corporate and government world, a plethora of security models determine who has access to what and who knows about it. In this presentation, we define the novel concept of degrees of separation in data classification security models and use machine learning techniques to infer non-public data.
We all use techniques to classify what data we share and with whom. In this presentation we look at a novel way of combining the aggregate public data available to retrieve non-public information about users. We define the way to infer private data with a certain probability from public data as a degree of separation on that data.
We use machine learning techniques to first extract features from this combined public data and then apply predictive models to infer non-public data. We present an analysis starting with public Twitter feeds from data scientists and then define three degrees of separation for that data.
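To make the pipeline above concrete, here is a minimal sketch of the general idea, not the method presented in the talk. It assumes bag-of-words features, a small naive Bayes classifier written with the standard library only, and a made-up private attribute ("sector"). The toy posts, the function names, and the probability thresholds that map an inference probability to one of three degrees are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: public posts paired with a private attribute
# (here an invented "sector" label). Not real Twitter data.
TOY_POSTS = [
    ("pricing options with monte carlo today", "finance"),
    ("backtesting a trading strategy on tick data", "finance"),
    ("survival analysis on patient cohorts", "health"),
    ("clinical trial data cleaning again", "health"),
]


def extract_features(text):
    """Feature extraction: lowercase, whitespace-split, alphabetic tokens only."""
    return [tok for tok in text.lower().split() if tok.isalpha()]


def train(samples):
    """Collect the counts needed for a multinomial naive Bayes model."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        label_counts[label] += 1
        for tok in extract_features(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict_proba(model, text):
    """Return P(label | text) for every label, with add-one smoothing."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    log_scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in extract_features(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        log_scores[label] = score
    # Normalise in a numerically stable way.
    peak = max(log_scores.values())
    unnorm = {lbl: math.exp(s - peak) for lbl, s in log_scores.items()}
    norm = sum(unnorm.values())
    return {lbl: v / norm for lbl, v in unnorm.items()}


def degree_of_separation(probability):
    """Map an inference probability to a degree (1 = easiest to infer).

    The 0.9 and 0.7 cut-offs are assumptions made for this sketch only.
    """
    if probability >= 0.9:
        return 1
    if probability >= 0.7:
        return 2
    return 3


model = train(TOY_POSTS)
probs = predict_proba(model, "new trading strategy backtesting")
best_label = max(probs, key=probs.get)
print(best_label, degree_of_separation(probs[best_label]))
```

In this sketch, a higher inference probability places the private attribute at a lower degree of separation from the public data. A real analysis would need far more data, held-out evaluation, and a principled choice of thresholds.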
In conclusion, we show a model that combines the heuristics of feature selection with the security model of data classification to define the degrees of separation.