Sentiment analysis aims to extract opinions from texts written in natural language, typically reviews or comments on social sites and forums. spaCy already provides mechanisms for processing natural language in general but does not offer a means for sentiment analysis.
This talk gives a short introduction to sentiment analysis and shows how to extract topics and ratings by using spaCy’s basic tools and extending them with a lexicon-based approach and simple Python code that consolidates sentiments spread over multiple words.
Topics covered are:
- What is sentiment analysis?
- Levels of sentiment detection
- Representing opinions
- Splitting texts into sentences and words
- Finding the base word (lemma)
- Extending spaCy’s pipeline and tokens
- Matching words to topics and ratings
- Combining multiple words into a rating
Code examples are introduced and explained using a Jupyter notebook that can be used as a basis for your own analysis.
As an additional twist, the analyzed texts are not in English but in German, to show that this approach can be used for multiple languages. No knowledge of German is required, though, because translations of the short example sentences are provided.
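To give a rough idea of the last two topics in the list (matching words to topics and ratings, and combining multiple words into a rating), here is a minimal sketch in plain Python. It is not the code from the talk's notebook. The lexicon entries, the numeric rating scale, and the `analyze` function are hypothetical. The sketch also assumes each sentence has already been split into lowercase lemmas, a step the talk performs with spaCy.

```python
# Hypothetical mini lexicon: lemma -> (kind, value). Illustrative only.
LEXICON = {
    "essen": ("topic", "food"),          # "Essen" = food
    "kellner": ("topic", "service"),     # "Kellner" = waiter
    "gut": ("rating", 1),                # good
    "lecker": ("rating", 1),             # tasty
    "schlecht": ("rating", -1),          # bad
    "unfreundlich": ("rating", -1),      # unfriendly
    "sehr": ("intensifier", 2),          # very
    "nicht": ("negation", None),         # not
}


def analyze(lemmas):
    """Return (topic, rating) for one sentence given as a list of lemmas."""
    topic = None
    rating = None
    factor = 1       # pending intensifier, applied to the next rating word
    negate = False   # pending negation, applied to the next rating word
    for lemma in lemmas:
        kind, value = LEXICON.get(lemma.lower(), (None, None))
        if kind == "topic" and topic is None:
            topic = value
        elif kind == "intensifier":
            factor *= value
        elif kind == "negation":
            negate = not negate
        elif kind == "rating":
            score = value * factor
            if negate:
                score = -score
            rating = score if rating is None else rating + score
            factor, negate = 1, False  # modifiers are consumed
    return topic, rating


# "Der Kellner war sehr unfreundlich." = "The waiter was very unfriendly."
print(analyze(["der", "kellner", "sein", "sehr", "unfreundlich"]))  # ('service', -2)
# "Das Essen war nicht gut." = "The food was not good."
print(analyze(["das", "essen", "sein", "nicht", "gut"]))  # ('food', -1)
```

Flipping the sign on negation is a deliberately simple choice; a real analysis might instead map "not good" to a milder rating than "bad".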