Description
Extractive summarization — finding the salient points in a document or corpus — is one of the most fundamental tasks in natural language processing. I’ll show you three ways to do it. One dates back to an IBM Journal article from 1958. One uses topic modeling, a technology from the 2000s. And one uses neural network-derived language embeddings and long short term memory networks — techniques that are only a couple of years old. I’ll explain the algorithms, show code and demos for all three, and I’ll discuss the engineering trade-offs.