Estimating stock price correlations using Wikipedia

Description

PyData Berlin 2016

Building an equities portfolio is a challenging task for a finance professional because it requires, among other inputs, estimates of future correlations between stock prices. These are usually derived from historical price data, which is not always available. In this talk I look at an alternative to historical correlations as a proxy for future correlations: graph analysis techniques and text similarity measures based on Wikipedia data.

According to Modern Portfolio Theory, assembling a portfolio involves forming expectations about each stock's future risk and return, as well as about future correlations between stock prices. These future correlations are typically estimated from historical stock price data. However, there are situations where this type of data is not available, such as the period preceding an IPO.
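To make the role of correlations concrete, here is a minimal sketch of the standard calculation: a Pearson correlation estimated from two return series, then plugged into the textbook portfolio volatility formula. The function names and the return figures are illustrative assumptions, not material from the talk.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation between two equally long return series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def portfolio_volatility(weights, volatilities, correlations):
    """Square root of sum_i sum_j w_i * w_j * sigma_i * sigma_j * rho_ij."""
    n = len(weights)
    variance = sum(
        weights[i] * weights[j]
        * volatilities[i] * volatilities[j]
        * correlations[i][j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(variance)


# Made-up daily returns for two hypothetical stocks (not real market data).
returns_a = [0.010, -0.020, 0.015, 0.003, -0.007]
returns_b = [0.008, -0.015, 0.020, -0.001, -0.004]

rho = pearson_correlation(returns_a, returns_b)

# Equal weights and assumed volatilities of 20% and 30%.
vol = portfolio_volatility([0.5, 0.5], [0.2, 0.3], [[1.0, rho], [rho, 1.0]])
print(f"correlation={rho:.3f}, portfolio volatility={vol:.3f}")
```

When no price history exists, `rho` cannot be computed this way, which is the gap the Wikipedia-based proxies are meant to fill.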

In this talk I look at an alternative to historical correlations as a proxy for future correlations: using graph analysis techniques and text similarity measures to estimate the correlation between stock prices.

The analysis focuses on the companies listed on the Frankfurt Stock Exchange that make up the DAX index. I use Wikipedia articles to obtain a textual description of each company, and the Wikipedia category structure to derive a graph describing the relations between companies.
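The two signals described above could be sketched roughly as follows. This is not the code from the talk: it assumes TF-IDF vectors with cosine similarity as the text measure and shortest-path length in a company-category graph as the graph measure, and it flattens the category hierarchy into direct company-to-category links. The company names, descriptions, and category labels are hypothetical placeholders, not real Wikipedia content.

```python
import networkx as nx
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Placeholder texts standing in for Wikipedia article text (hypothetical).
descriptions = {
    "Company A": "car manufacturer producing passenger vehicles and engines",
    "Company B": "automobile manufacturer producing vehicles and motorcycles",
    "Company C": "bank offering retail banking and investment services",
}

# Placeholder category assignments (hypothetical labels).
categories = {
    "Company A": ["Car manufacturers", "Companies in the DAX"],
    "Company B": ["Car manufacturers", "Motorcycle manufacturers"],
    "Company C": ["Banks", "Companies in the DAX"],
}


def text_similarity(descriptions):
    """Cosine similarity of TF-IDF vectors for every pair of companies."""
    names = list(descriptions)
    texts = [descriptions[name] for name in names]
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(texts)
    sim = cosine_similarity(tfidf)
    return {
        (a, b): float(sim[i, j])
        for i, a in enumerate(names)
        for j, b in enumerate(names)
        if i < j
    }


def build_category_graph(categories):
    """Undirected graph linking each company node to its category nodes."""
    graph = nx.Graph()
    for company, cats in categories.items():
        for cat in cats:
            # Tuples keep company and category names from colliding.
            graph.add_edge(("company", company), ("category", cat))
    return graph


def graph_distance(graph, a, b):
    """Number of edges on the shortest path between two companies."""
    return nx.shortest_path_length(graph, ("company", a), ("company", b))


sims = text_similarity(descriptions)
graph = build_category_graph(categories)
for (a, b), score in sims.items():
    print(a, b, round(score, 3), graph_distance(graph, a, b))
```

Higher text similarity and shorter graph distance would both be read as hints of higher price correlation; how well either proxy tracks realized correlations is the empirical question the talk addresses.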

The analysis is performed using the scikit-learn and NetworkX libraries, and example code is available to the audience.

https://github.com/deliarusu/wikipedia-correlation

Slides: https://speakerdeck.com/deliarusu/estimating-stock-price-correlations-using-wikipedia
