Text-Image Sentiment Analysis

Ragusa E.;Zunino R.
2023-01-01

Abstract

Expressiveness varies from one person to another. Most images posted on Twitter lack good labels, and the accompanying tweets are noisy. Hence, in this paper we identify the content and sentiment of images by fusing image and text features. We leverage the fact that AlexNet is a pre-trained model with strong performance in image classification and that the corresponding set of images is extracted from the web. In particular, we present a novel method to extract features from Twitter images and the corresponding labels or tweets using deep convolutional neural networks trained on Twitter data. We fine-tune a pre-trained AlexNet CNN to initialize the model and use the AffectiveSpace of English concepts as text features. Lastly, to combine the image and text predictions, we propose a novel sentiment score. Our model is evaluated on a Twitter dataset of images with their corresponding labels and tweets. We show that merging the scores from the text and image models yields higher accuracy than using either system alone.
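
The abstract does not define the proposed sentiment score, so the following is only a minimal sketch of generic score-level (late) fusion, not the authors' formula. It assumes that each model outputs a polarity score in [-1, 1]; the function names `fuse_sentiment` and `polarity_label`, the weight `image_weight`, and the decision threshold are all hypothetical.

```python
def fuse_sentiment(image_score: float, text_score: float,
                   image_weight: float = 0.5) -> float:
    """Weighted average of two polarity scores (assumed to lie in [-1, 1]).

    This is a generic late-fusion rule used for illustration only;
    it is NOT the sentiment score proposed in the paper.
    """
    if not 0.0 <= image_weight <= 1.0:
        raise ValueError("image_weight must be in [0, 1]")
    return image_weight * image_score + (1.0 - image_weight) * text_score


def polarity_label(score: float, threshold: float = 0.0) -> str:
    """Map a fused score to a coarse label (the threshold is an assumption)."""
    return "positive" if score > threshold else "negative"


if __name__ == "__main__":
    # Hypothetical scores: the image model is confident, the text model is mildly negative.
    fused = fuse_sentiment(image_score=0.8, text_score=-0.2, image_weight=0.5)
    print(polarity_label(fused))
```

With equal weights, a confident prediction from one model can outweigh a weak, noisy prediction from the other, which is one plausible reason a fused score might outperform either system alone.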
2023
978-3-031-23803-1
978-3-031-23804-8

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1141939