Contribute Media
A thank you to everyone who makes this possible: Read More

Multi-lingual Text Data for Sentiment Classification

Description

Labeled Training Datasets or off-the-shelf tools for labeling training datasets for Sentiment Classification are readily available for English and other major languages. For Tagalog and other Filipino languages, however, this is rarely the case. I developed a novel technique that essentially transfers the knowledge of the labeled training dataset on one language (English) into Tagalog and other Filipino languages. This thus enables one to develop Neural Network Sentiment Classification models on several Filipino languages, leveraging on the availability of large amounts of training data on the English language. This is a better approach than manually labeling data on Filipino languages, as it would take too many resources, and Neural Networks need huge amounts of training data in order to be effective.

Details

Improve this page