Authorship Attribution on Reviews (CICLING 2016)
The data is provided as MySQL tables. You can get it from the following links:
Amazon Reviews
Yelp Hotel Reviews
Yelp Restaurant Reviews

Author Profiling on Health Forums (LREC 2016)

The dataset needs to be crawled from the Dailystrength website. The scripts to download the data are here.
Note: These scripts worked with the old version of the forum. The forum website has since been restructured, and they no longer work. If you want to use this dataset, please contact us; we are happy to help.
The familial token word list we used for this project is here.

EMNLP 2016 Shared Task


A Multi-task Approach to Predict Likability of Books (EACL 2017)
Dataset: Books

Detecting Nastiness in Social Media (ALW1)
The data was prepared by crawling the profile pages of random users. Each row contains a question and the related answer, together with their labels. Each label indicates whether the post is invective or neutral. You can get the data from the following link: