PyData Berlin 2016
Forming an unbiased view of media reports requires understanding the political bias of each text. This talk shows how basic tools from machine learning and natural language processing, combined with publicly available data, can be turned into assistive technology for automatically estimating the political bias of a text. Some common pitfalls and example applications will be discussed.
Every day, media outlets generate large amounts of text. Getting an unbiased view of what the media report requires an unbiased sample of media content. In many cases the political bias of an author is obvious. In other cases some expertise is required to judge the political bias of a text. Assistive technology for estimating the political bias of texts can help in this context, especially when many texts have to be processed. We investigated to what extent political party affiliation can be predicted from textual content with basic machine learning tools. We used the text of speeches and discussions in the German parliament, as well as texts from party manifestos, to train classifiers that predict political party affiliation or political views from standard text features. Results indicate that automatic classification of political affiliation and political views is possible with accuracy well above chance. We hope that this work will eventually be helpful for unbiased political education in the presence of massive amounts of media content. We show some web applications that demonstrate how the models can be combined with classical topic models to analyse texts for which the party affiliation is not clear, such as news articles.
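The classification setup described above can be illustrated with a minimal sketch. The abstract does not say which classifier or features were used, so everything here is an assumption for illustration: a multinomial Naive Bayes model over bag-of-words counts, written with the Python standard library only, trained on a handful of invented sentences with the placeholder labels `party_a` and `party_b`. The function names `tokenize`, `train_naive_bayes` and `predict` are hypothetical and not taken from the talk's code.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (bag-of-words)."""
    return re.findall(r"[a-zäöüß]+", text.lower())


def train_naive_bayes(texts, labels):
    """Collect the word and label counts needed for prediction."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter(labels)      # label -> number of training texts
    vocabulary = set()
    for text, label in zip(texts, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return word_counts, label_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocabulary = model
    total_texts = sum(label_counts.values())
    scores = {}
    for label, n_texts in label_counts.items():
        # Start from the log prior of the label.
        score = math.log(n_texts / total_texts)
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Invented toy data; the labels are placeholders, not real parties.
train_texts = [
    "lower taxes and less regulation for business",
    "cut taxes to support business growth",
    "higher wages and stronger worker protection",
    "protect workers and raise the minimum wage",
]
train_labels = ["party_a", "party_a", "party_b", "party_b"]

model = train_naive_bayes(train_texts, train_labels)
prediction_a = predict(model, "taxes for business")
prediction_b = predict(model, "raise wages for workers")
print(prediction_a, prediction_b)
```

With real data, the training texts would be parliament speeches or manifesto passages labelled with the speaker's or author's party, and accuracy would have to be measured on held-out texts to support a claim such as "well above chance".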
GitHub Repo: https://github.com/felixbiessmann/