Automatic irony detection

Application to opinion mining in microblogs and social media. In this technical update, Jihen Karoui presents her recently published book.

Sentiment analysis

Sentiment analysis is an extremely active field of research in Automatic Language Processing (ALP). In recent years we have seen an exponential increase in textual data sources of opinion available on the web: web user reviews, forums, social networks, consumer surveys, etc. Faced with this abundance of data, the automated synthesis of multiple opinions is becoming crucial to effectively obtaining an overview of all the opinions on a given subject.

Since the 2000s, many works have been published on the topic, making opinion mining a very active field in ALP research. Overall, current systems have obtained good results in automatically classifying whether a document is subjective or objective.

However, the results obtained on the task of polarity analysis (which consists in ranking a document on a polarity scale from the most positive to the most negative) remain inconclusive. The main reason for this failure is the inability of current algorithms to grasp all the subtleties of human language, such as the use of figurative language. Unlike literal language, figurative language relies on linguistic mechanisms such as irony, humour, sarcasm, metaphor and analogy, which make it difficult to represent and process automatically. In this work, we focus on irony and sarcasm in a particular type of data, namely tweets.

Within this framework, we propose a supervised learning-based approach to predict whether a tweet is ironic or not. To do so, we followed a three-stage approach. Firstly, we analysed the pragmatic phenomena used to express irony, drawing inspiration from linguistic works to define a multi-level annotation schema for irony. This annotation schema was used as part of a campaign to annotate a corpus of 2,000 French tweets. In the second stage, using all the observations made on the annotated corpus, we developed an automatic irony detection model for tweets in French, which exploits both the internal context of the tweet, through lexical and semantic features, and the external context, by searching for information available on the web.

Finally, in the third stage, we studied the portability of the model for detecting irony within a multilingual framework (Italian, English and Arabic). We tested the performance of the proposed annotation schema on Italian and English, and the performance of the feature-based automatic detection model on Arabic. The results obtained for this extremely complex task are very encouraging and open an avenue for improving polarity detection in sentiment analysis.

Context and motivations

These days, the Web has become an essential source of information thanks to the quantity and diversity of textual content conveying opinions expressed by web users. This content comes in many forms: blogs, comments, forums, social networks, reactions or reviews, increasingly centralised by search engines. Faced with this abundance of data and sources, the development of tools to extract, summarise and compare the opinions expressed on a given topic has become crucial. Such tools are of considerable benefit both to companies looking for customer feedback on their products or brand image and to individuals who want information about a purchase, an outing or a trip.

Currently, market research firms are also interested in these tools, for example to assess a product's standing on the market or to predict the results of presidential elections.

In this context, sentiment analysis or opinion mining emerged. The first research works on automatic opinion mining date back to the 1990s with, in particular, the works of Hatzivassiloglou and McKeown, which address the determination of the polarity of adjectives, and those of Pang et al. and Littman and Turney on the classification of documents according to their positive or negative polarity. Since the 2000s, many works have been published on the topic, making opinion mining a very active field in Automatic Language Processing (ALP) research. Many assessment campaigns are also dedicated to this topic, such as the TREC (Text REtrieval Conference) campaign, the DEFT (Défi Fouille de Textes - Text Mining Challenge) campaign for French, first carried out in 2005, and the SemEval (Semantic Evaluation) campaign, first carried out in 1998.

Overall, the current systems have obtained good results in the task of subjectivity analysis, which consists in determining whether a portion of text conveys an opinion (i.e. is subjective) or only presents facts (i.e. is objective) (cf. the works of P. D. Turney). For example, the use of subjectivity lexicons potentially paired with classification techniques enables us to detect that the author is expressing a positive opinion towards the prime minister in sentence (1) (via the use of the adjective excellent of positive polarity).

(1) The prime minister gave an excellent speech.
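As a toy illustration of this lexicon-based approach, the sketch below scores a sentence against a tiny hand-made polarity lexicon. The word lists and the `polarity` function are illustrative assumptions, not resources from the book:

```python
# Toy lexicon-based polarity sketch; the word lists are illustrative only.
POSITIVE = {"excellent", "outstanding", "wonderful", "love", "great"}
NEGATIVE = {"bad", "terrible", "awful", "poor"}

def polarity(sentence: str) -> str:
    """Classify a sentence as positive/negative/neutral by counting lexicon hits."""
    tokens = [t.strip(".,!?").lower() for t in sentence.split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(polarity("The prime minister gave an excellent speech."))  # positive
```

As the figurative examples further down show, such a surface count breaks down precisely when positive words are used to convey a negative opinion.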

However, the results of opinion analysis systems on the task of polarity analysis, which consists in determining the overall polarity and/or the score of the opinion effectively conveyed by a portion of text that we know to be subjective, still remain inconclusive. The three examples below, taken from the works of F. Benamara, perfectly illustrate the difficulty of the task:

(2) [I bought a second-hand iPhone 5s three months ago.]P1 [The image quality is outstanding.]P2 [However, the tempered glass protection is not good quality]P3 [and the battery gave up on me after 15 days!!]P4

Example (2) contains four clauses, surrounded by brackets. Only the last three contain opinions. Of these opinions, the first two are explicit, meaning they are identifiable through subjective words, symbols or expressions in the language, such as the adjective outstanding.

The last, however, is implicit, because it is based on words or groups of words that describe a situation (fact or state) that is deemed desirable or undesirable based on cultural and/or pragmatic knowledge shared by the author and readers.

Comments (3) and (4) below, where the authors use figurative language to express their opinions, also illustrate the difficulty of polarity analysis. Both express negative opinions even though the authors use positive opinion words (love, thank you, wonderful).

(3) I love how your product breaks down as soon as I need it.

(4) Thank you once again, SNCF. It's going to be a wonderful day yet again.

Sometimes, implicit opinions can be expressed ironically, which complicates polarity analysis even further. In tweet (5), taken from the FrIC corpus, the user makes a false assertion, which makes the message very negative towards Valls. Here we notice the use of the hashtag #irony, which helps the reader understand that the message is ironic.

(5) #Valls learnt about #Sarkozy being tapped by reading the newspaper. Luckily, he's not the minister of the interior #irony

It is however important to note that although mining the opinion in these examples is of almost childlike simplicity for a human, doing so automatically is extremely complex for a computer program.

Beyond determining subjective expressions in language, the problem of distinguishing between explicit and implicit opinions, or even of identifying the use of figurative language, remains unresolved due to the inability of current systems to grasp the context in which opinions are expressed.

In this work, we address the automatic detection of figurative language, a linguistic phenomenon that is extremely common in messages posted on social networks. In recent years, the detection of this phenomenon has become an extremely active research topic in ALP, mainly due to its importance in improving the performance of opinion analysis systems.

Towards the detection of figurative language

Unlike literal language, figurative language distorts literal meaning to convey a figurative or pictorial one, through devices such as metaphor, irony, sarcasm, satire and humour. Irony is a complex phenomenon studied widely in philosophy and linguistics (see the works of D. Sperber and D. Wilson, and of A. Utsumi). Overall, irony is defined as a figure of rhetoric through which we say the opposite of what we mean (see examples (3) and (4)). In computational linguistics, irony is a generic term used to refer to a range of figurative phenomena including sarcasm, even if the latter is expressed with more bitterness and aggression (see the works of R. Clift).

Each type of figurative language has its own linguistic mechanisms that enable us to grasp the figurative meaning: inversion of reality or truth to express irony, the presence of comic effects to express humour, and so on. In most cases, all figurative phenomena require the context of the statement so that the reader or listener can interpret the figurative meaning of a given statement.

Consequently, it is important to be able to infer information beyond lexical, syntactic or even semantic effects in a text. These inferences may vary according to the speaker's profile (such as gender) or their cultural context.

Most works on irony detection in ALP concern corpora of tweets, because authors may explicitly indicate that their messages are ironic by using specific hashtags, such as #sarcasm, #irony or #humour. These hashtags are then used to gather a manually-annotated corpus, an essential resource for the supervised classification of tweets as ironic or not ironic. State-of-the-art works mainly concern tweets in English, but works also exist for the detection of irony and/or sarcasm in Italian, Chinese and Dutch.
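This hashtag-based collection strategy can be sketched as a simple filter. The tag set and the `label_and_clean` function are hypothetical; real pipelines typically also remove the labelling hashtag from the text so the classifier cannot rely on it directly:

```python
# Hypothetical sketch of hashtag-based corpus labelling.
IRONY_TAGS = {"#irony", "#sarcasm", "#humour"}

def label_and_clean(tweet: str) -> tuple[str, str]:
    """Label a tweet from its hashtags, then strip those hashtags from the text."""
    tokens = tweet.split()
    label = "ironic" if any(t.lower() in IRONY_TAGS for t in tokens) else "not_ironic"
    cleaned = " ".join(t for t in tokens if t.lower() not in IRONY_TAGS)
    return label, cleaned

print(label_and_clean("Luckily, he's not the minister of the interior #irony"))
```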

Overall, the approaches that have been proposed are almost exclusively based on exploiting the linguistic content of the tweet. Two main families of indicators have been used:

  • Lexical indicators (n-grams, number of words, presence of opinion words or expressions of emotion) and/or stylistic indicators (presence of emoticons, interjections, quotes, use of slang, word repetition)
  • Pragmatic indicators to capture the context required to infer irony. These indicators are however extracted from the linguistic content of the message, such as a sudden change in verb tense, the use of semantically distant words, or the use of frequent words vs rare words.
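To make these two families concrete, the sketch below extracts a handful of such indicators from a tweet's text alone. The feature names and the opinion-word list are illustrative assumptions, not the book's actual feature set:

```python
import re

EMOTICON = re.compile(r"[:;=]-?[()DPp]")  # very rough emoticon pattern
OPINION_WORDS = {"love", "hate", "wonderful", "terrible", "excellent"}

def extract_features(tweet: str) -> dict:
    """Extract a few lexical/stylistic and surface-level pragmatic indicators."""
    tokens = [t.strip(".,!?").lower() for t in tweet.split()]
    return {
        "n_tokens": len(tokens),
        "n_opinion_words": sum(t in OPINION_WORDS for t in tokens),
        "has_emoticon": bool(EMOTICON.search(tweet)),
        "n_exclamations": tweet.count("!"),
        "has_repetition": len(tokens) != len(set(tokens)),
    }

print(extract_features("I love how your product breaks down as soon as I need it."))
```

Feature vectors of this kind are then fed to a standard supervised classifier trained on the hashtag-labelled corpus.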

These approaches have obtained encouraging results (e.g. Reyes et al. obtained 79% precision on English tweets; see chapter 2 of the book for a detailed state-of-the-art review and the results of existing approaches). We do however think that this type of approach, although essential, is only a first step, and that it is vital to go further by proposing more pragmatic approaches that enable us to infer the extra-linguistic context necessary to understand this complex phenomenon.


Within this framework, we are focusing for the first time on tweets in French, and propose a supervised learning-based approach to predict whether a tweet is ironic or not. Our contributions can be summarised in three main points.

  1. A conceptual model to grasp the pragmatic phenomena used to express irony in messages posted on Twitter.

Drawing inspiration from linguistic works on irony, we propose the first multi-level annotation schema for irony. This schema, published in the workshop ColTal@TALN2016, was used as part of a campaign to annotate a corpus of 2,000 French tweets. An extended version of this corpus was used as training data as part of the first assessment campaign on opinion analysis and figurative language DEFT@TALN 2017. The annotation schema as well as the quantitative and qualitative results of the annotation campaign are described in chapter 3 of the book.

  2. A computational model to infer the pragmatic context necessary for detecting irony.

Using all the observations made on the annotated corpus, we developed an automatic irony detection model for tweets in French, which exploits both the internal context of the tweet, through lexical and semantic features, and the external context, by searching for information available in reliable external sources. Our model enables us, in particular, to detect irony conveyed by false assertions (see example (5)). This model, which was published at TALN 2015 and ACL 2015, is presented in chapter 4 of the book.
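The combination of internal and external context described above can be sketched roughly as follows. The `FALSE_CLAIMS` lookup is a toy stand-in for the model's actual query against reliable external sources, and the weights and decision rule are deliberately simplified assumptions, not the published model:

```python
# Hedged sketch: combine internal features with an external false-assertion check.
FALSE_CLAIMS = {
    # toy stand-in for a query against reliable external sources
    "he's not the minister of the interior",
}

def internal_score(features: dict) -> float:
    """Toy score from internal cues (weights are illustrative, not the book's)."""
    return 0.6 * features.get("n_opinion_words", 0) + 0.4 * features.get("n_exclamations", 0)

def predict_ironic(tweet: str, features: dict) -> bool:
    """A contradiction with external sources is a strong irony cue on its own;
    otherwise fall back on the internal-context score."""
    external_cue = any(claim in tweet.lower() for claim in FALSE_CLAIMS)
    return external_cue or internal_score(features) >= 1.0

print(predict_ironic(
    "Luckily, he's not the minister of the interior",
    {"n_opinion_words": 0, "n_exclamations": 0},
))  # True: the assertion contradicts external knowledge
```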

  3. Study of the portability of both the conceptual and computational models to detect irony in a multilingual framework.

We first tested the portability of our annotation schema on tweets in Italian and English, two Indo-European languages that are culturally close to French. Our results, which were published at EACL 2017, show that our schema applies perfectly to these languages. We then tested the portability of our computational model on Arabic, where tweets are written both in Modern Standard Arabic and in dialectal Arabic. Our results show that our model, once again, holds up well to a different language family. The portability of our models is discussed in chapter 5 of the book.


Within the same framework, the R&D centre of the company AUSY develops solutions using artificial intelligence, in particular automatic language processing, to offer innovative and intelligent applications to users.

About the author:
Jihen Karoui


Practice Leader AI & Big Data

Jihen Karoui holds a PhD in Artificial Intelligence, specialising in automatic language processing. Her research interests include the understanding and interpretation of human language in both written and spoken form. She is particularly interested in the analysis and processing of all types of textual and visual data for the development of intelligent solutions. She has co-authored two scientific books.