Description
This session is for people who would like to move from pure-play analytics to heavy data lifting and data engineering applications. With a background in computer science and quantitative economics, I blend the two approaches to get quick wins (and fail cheaper) while solving some of the most interesting problems around us today, for fun and profit.
The four problems I will discuss are:
- Scoring your connections to find your net worth on LinkedIn [data structures, algorithms, data mining & visualization]
- Building custom distance metrics (with graphs as the base data structure) and finger keying, with applications to hand-collected data sets, which may call for entirely different distance metrics than those traditionally explored [algorithms]
- Building relevant job feeds on LinkedIn (TF-IDF) and (possibly) hacking around job applications [relevance algorithms, linguistic pre-processing & visualization]
- Developing a custom batch aggregation platform with competing system goals [developing scoring metrics, the concept of centrality & formal constructs to measure recency]
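To give a flavor of the job-feed problem, here is a minimal sketch of TF-IDF relevance ranking in plain Python. It is not the implementation covered in the session: the function names (tokenize, tf_idf_vectors, cosine, rank_jobs), the smoothed IDF formula, the whitespace tokenizer, and the use of cosine similarity are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenizer, alphabetic words only.
    return [w for w in text.lower().split() if w.isalpha()]


def tf_idf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Assumption: smoothed IDF, so a term present in every document keeps a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in doc_freq.items()}

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_jobs(profile, jobs):
    """Score each job description against a profile text, most relevant first."""
    vectors = tf_idf_vectors([profile] + jobs)
    profile_vec = vectors[0]
    scored = [(cosine(profile_vec, v), job) for v, job in zip(vectors[1:], jobs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling rank_jobs with a short profile text and a list of job descriptions returns (score, description) pairs; a description that shares no words with the profile scores 0.0 and sinks to the bottom of the feed.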
Most of all, you will learn the why and why not of things, which I
will keep coming back to as I discuss the problems above. This, I feel,
is the most important differentiator between coding well and coding
for scale, and it helps build a structured thought process for problem
solving.
By the end of the session, you will have seen a blend of tools ranging
from Python (algorithms, data mining & graph theory) to Python, R &
Inkscape (graphical representation & visualization).
The session will be heavy on algorithms and on thinking in data structures, so to make the most of it you need a background or interest in computer science, quantitative rigor, some hands-on coding experience, and a mind that wants to learn more.