sentiment-analysis-twitter-roberta-base
Text sentiment analysis with three classes: positive, negative, and neutral.
Notes
Twitter-roBERTa-base for Sentiment Analysis
This is a roBERTa-base model trained on ~58M tweets and fine-tuned for sentiment analysis with the TweetEval benchmark. This model is suitable for English text.
The output of the text sentiment analysis model is one of three labels denoting the sentiment of the text:
- 0 -> Negative
- 1 -> Neutral
- 2 -> Positive
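The index-to-label mapping above can be sketched in a few lines of Python. The logit values below are hypothetical, standing in for the raw scores a 3-way text classifier would emit; only the mapping itself comes from the model card.

```python
import math

# Label mapping from the model card: class index -> sentiment.
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}

def softmax(logits):
    """Convert raw classifier logits to probabilities."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    """Return (label, confidence) for a 3-way sentiment logit vector."""
    probs = softmax(logits)
    idx = probs.index(max(probs))
    return ID2LABEL[idx], probs[idx]

# Hypothetical logits, as the model might produce for a positive tweet.
label, score = decode([-1.2, 0.3, 2.8])  # label == "positive"
```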
Pro Tip
Note that the sentiment analysis model is suitable only for English text. If you need this workflow for text blocks in other languages, you can create a derived custom workflow and insert a text translation model before the sentiment analysis model. This ensures the sentiment analysis model receives its input text in English.
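The translate-then-classify workflow described above can be sketched as a simple function composition. All function names here are illustrative stand-ins for real workflow nodes, not part of any actual API; the stubs exist only to make the composition concrete.

```python
# Hypothetical two-step workflow: translate non-English input first,
# then run sentiment classification on the English text.
def make_workflow(translate, classify):
    """Compose a translation node and a sentiment node into one pipeline."""
    def run(text, source_lang="en"):
        english = text if source_lang == "en" else translate(text, source_lang)
        return classify(english)
    return run

# Stub nodes for demonstration only (a real workflow would call models).
def fake_translate(text, source_lang):
    return {"muy bueno": "very good"}.get(text, text)

def fake_classify(text):
    return "positive" if "good" in text else "neutral"

workflow = make_workflow(fake_translate, fake_classify)
result = workflow("muy bueno", source_lang="es")  # translated, then classified
```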
More Info
- Repositories:
- Hugging Face docs
Papers
Original
TWEETEVAL: Unified Benchmark and Comparative Evaluation for Tweet Classification
Authors: Francesco Barbieri, Jose Camacho-Collados, Leonardo Neves, Luis Espinosa-Anke
Abstract
The experimental landscape in natural language processing for social media is too fragmented. Each year, new shared tasks and datasets are proposed, ranging from classics like sentiment analysis to irony detection or emoji prediction. Therefore, it is unclear what the current state of the art is, as there is no standardized evaluation protocol, neither a strong set of baselines trained on such domain-specific data. In this paper, we propose a new evaluation framework (TWEETEVAL) consisting of seven heterogeneous Twitter-specific classification tasks. We also provide a strong set of baselines as starting point, and compare different language modeling pre-training strategies. Our initial experiments show the effectiveness of starting off with existing pretrained generic language models, and continue training them on Twitter corpora.
Latest
TimeLMs: Diachronic Language Models from Twitter
Authors: Daniel Loureiro, Francesco Barbieri, Leonardo Neves, Luis Espinosa Anke, Jose Camacho-Collados
Abstract
Despite its importance, the time variable has been largely neglected in the NLP and language model literature. In this paper, we present TimeLMs, a set of language models specialized on diachronic Twitter data. We show that a continual learning strategy contributes to enhancing Twitter-based language models’ capacity to deal with future and out-of-distribution tweets, while making them competitive with standardized and more monolithic benchmarks. We also perform a number of qualitative analyses showing how they cope with trends and peaks in activity involving specific named entities or concept drift.
- Name: sentiment-analysis-twitter-roberta-base
- Model Type ID: Text Classifier
- Description: Text sentiment analysis with three classes: positive, negative, and neutral.
- Last Updated: Aug 03, 2022
- Privacy: PUBLIC