Most NLP and computer vision tasks are limited by the scarcity of labelled data. In social media emotion classification and related tasks, hashtags have been used as indicators to label data. With the rapid increase in emoji usage on social media, emojis are now used as an additional feature for major social NLP tasks. However, this is less explored for multimedia posts, which are composed of both image and text. At the same time, we have seen a surge of interest in incorporating domain knowledge to improve machine understanding of text. In this paper, we investigate whether domain knowledge for emoji can improve the accuracy of the emotion classification task. We exploit the importance of different modalities of a social media post for emotion classification using state-of-the-art deep learning architectures. Our experiments demonstrate that the three modalities (text, emoji and images) encode different information to express emotion and can therefore complement each other. Our results also demonstrate that emoji sense depends on the textual context, and that emoji combined with text encode better information than either considered separately. The highest accuracy, 71.98%, is achieved with a training set of 550k posts.
Published in Companion Proceedings of The 2019 World Wide Web Conference, 2019, pages 439-449.
Published in WWW2019 Proceedings © 2019 International World Wide Web Conference Committee, published under the Creative Commons CC BY 4.0 License.
Illendula, A., & Sheth, A. (2019). Multimodal Emotion Classification. Companion Proceedings of The 2019 World Wide Web Conference, 439–449. https://doi.org/10.1145/3308560.3316549