Incremental Word Vectors for Time-Evolving Sentiment Lexicon Induction

Abstract

A sentiment lexicon is a list of expressions annotated according to affect categories such as positive, negative, anger, and fear. Lexicons are widely used in sentiment classification of tweets, especially when labeled messages are scarce. Sentiment lexicons are prone to obsolescence due to: 1) the arrival of new sentiment-conveying expressions such as #trumpwall and #PrayForParis and 2) temporal changes in the sentiment patterns of words (e.g., a scandal associated with an entity). In this paper, we propose a methodology for automatically inducing continuously updated sentiment lexicons from Twitter streams by training incremental word sentiment classifiers on time-evolving distributional word vectors. We experiment with various sketching techniques for efficiently building incremental word-context matrices and study how the lexicon adapts to drastic changes in sentiment patterns. Change is simulated by randomly picking words from a testing partition and swapping their contexts with the contexts of words exhibiting the opposite sentiment. Our experimental results show that our approach successfully tracks the sentiment of words over time, even when drastic change is induced.
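
To make the idea of an incremental word-context matrix concrete, the following is a minimal sketch and not the authors' implementation. It assumes the hashing trick as the sketching technique (the abstract does not say which techniques were used), and the names `HashedContextVectors`, `bucket`, `DIM`, and the `swap` argument are hypothetical. Each word keeps a fixed-size count vector that is updated one tweet at a time, and the simulated change is modeled by redirecting a word's context counts to a word of the opposite sentiment.

```python
import zlib
from collections import defaultdict

DIM = 16  # hypothetical sketch size; a real stream would use far more buckets


def bucket(word, dim=DIM):
    """Map a context word to a fixed bucket with a deterministic hash."""
    return zlib.crc32(word.encode("utf-8")) % dim


class HashedContextVectors:
    """Incremental word-context counts, compressed by feature hashing."""

    def __init__(self, dim=DIM, window=2):
        self.dim = dim
        self.window = window
        self.vectors = defaultdict(lambda: [0] * dim)

    def update(self, tokens, swap=None):
        """Add one tokenized tweet to the counts.

        swap (assumption): maps a word to the word whose vector should
        receive its context, which simulates a sentiment change.
        """
        swap = swap or {}
        for i, target in enumerate(tokens):
            owner = swap.get(target, target)
            lo = max(0, i - self.window)
            hi = min(len(tokens), i + self.window + 1)
            for j in range(lo, hi):
                if j != i:
                    self.vectors[owner][bucket(tokens[j], self.dim)] += 1


model = HashedContextVectors()
model.update("what a great happy day".split())
model.update("such a sad awful day".split())

# After the simulated change point, "happy" and "sad" exchange contexts.
swap = {"happy": "sad", "sad": "happy"}
model.update("what a great happy day".split(), swap=swap)
```

In a full pipeline, these vectors would serve as features for an incremental word-level sentiment classifier trained on words from a seed lexicon. After the swap, the vector stored under "sad" starts accumulating positive contexts, which is the kind of drift such a classifier would have to follow.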

Authors

Felipe Bravo-Marquez (University of Chile)
Bernhard Pfahringer (The University of Waikato)
Arun Khanchandani
