SemEval 2020 - Task 10:

Emphasis Selection For Written Text in Visual Media

Visual communication relies heavily on images and short texts. Flyers, posters, ads, social media posts, and motivational messages are usually carefully designed to grab a viewer’s attention and convey a message as efficiently as possible. For text, word emphasis is used to better capture the intent and remove the ambiguity that may exist in plain text.
Word emphasis can clarify or even change the meaning of a sentence by drawing attention to specific information. It can be applied through colors, backgrounds, fonts, italics, or boldface.

Our shared task is designed to invite research in this area. We expect to see a variety of traditional and modern NLP techniques used to model emphasis. Whether you are an expert or new to Natural Language Processing, we encourage you to participate in this fun new task.


The purpose of this shared task is to design automatic methods for emphasis selection, i.e. choosing candidates for emphasis in short written text, to enable automated design assistance in authoring.


Spark dataset: This dataset is collected from Adobe Spark. It contains 1,200 instances of short texts on a variety of subjects, as featured in flyers, posters, advertisements, or motivational memes on social media.

Quotes dataset: This dataset contains 2,710 instances of quotes from well-known authors, collected from Wisdom Quotes.


  • No additional context from the user or the rest of the design such as background image is provided.
  • The datasets contain very short texts, usually fewer than 10 words. 
  • Word emphasis patterns are author- and domain-specific. Without knowing the author’s intent and only considering the input text, multiple emphasis selections are valid. A good model, however, should capture the inter-subjectivity or common sense within the given annotations and label words according to where annotators agree most.
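
As an illustration of labeling words according to annotator agreement, here is a minimal Python sketch. It assumes a hypothetical annotation format in which each annotator gives one binary label per word (1 = emphasize, 0 = do not); this is not necessarily the format of the released datasets, and the function names are made up for this example. It scores each word by the fraction of annotators who emphasized it, then ranks words by that score. It is not the official evaluation script.

```python
from typing import List


def emphasis_scores(annotations: List[List[int]]) -> List[float]:
    """Return, for each word, the fraction of annotators who emphasized it.

    `annotations` holds one row per annotator and one 0/1 label per word
    (a hypothetical format chosen for this sketch).
    """
    if not annotations:
        raise ValueError("need at least one annotator")
    n_words = len(annotations[0])
    if any(len(row) != n_words for row in annotations):
        raise ValueError("every annotator must label every word")
    n_annotators = len(annotations)
    return [
        sum(row[i] for row in annotations) / n_annotators
        for i in range(n_words)
    ]


def top_k_words(words: List[str], scores: List[float], k: int) -> List[str]:
    """Return the k words with the highest agreement score.

    Ties are broken by position in the text, earlier words first.
    """
    ranked = sorted(range(len(words)), key=lambda i: (-scores[i], i))
    return [words[i] for i in ranked[:k]]


if __name__ == "__main__":
    words = ["Never", "stop", "learning"]
    # Three hypothetical annotators, one label per word.
    annotations = [
        [1, 0, 1],
        [0, 0, 1],
        [1, 0, 1],
    ]
    scores = emphasis_scores(annotations)
    print(top_k_words(words, scores, k=2))
```

In this toy example all three annotators emphasize "learning" and two of three emphasize "Never", so those are the two highest-ranked words. A learned model would predict such scores from the text alone rather than reading them from annotations.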

Important Notes

  • We will announce the best paper award for each of the following categories:
    • The winner(s) of the task – based on the evaluation metric (Ranking)
    • The best system description paper (best results interpretation)
    • The best negative results paper
  • We encourage all teams to describe their submission in a SemEval-2020 paper (ACL format), including teams with negative results.
  • We encourage all teams to open source their implementations.

Important Dates

Trial data ready: July 31, 2019
Training data ready: September 4, 2019
Test data ready: December 3, 2019
Evaluation start: January 10, 2020
Evaluation end: January 31, 2020
Paper submission due: February 23, 2020
Notification to authors: March 29, 2020
Camera ready due: April 5, 2020
SemEval workshop: Summer 2020

Register and Participate


Get started by filling out this form, then register your team at CodaLab here.
You can now download the dataset and evaluation script.

Feel free to join the Google group for task-related news and discussions:


“Learning Emphasis Selection for Written Text in Visual Media from Crowd-Sourced Label Distributions”, 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019)