Computational Intelligence Lab
Impact of Lexical Normalization on Twitter Sentiment Analysis
In this paper, we show the impact of lexical normalization on the performance of different models for Twitter sentiment analysis. We investigate BERT and ALBERT models of various sizes and perform lexical normalization using MoNoise in both its default and bad-speller modes. Our findings suggest that the impact of lexical normalization depends on both the model architecture and the model size, and that lexical normalization can also hurt performance. It is therefore not possible to give a definitive recommendation on whether to perform lexical normalization before further data analysis.
The source code of our implementation can be found here and a written report can be found here.