Publications – RiTUAL


2022

Aguilar, Gustavo; Chen, Shuguang; Srinivasan, Anirudh

CALCS 2021 Shared Task: Machine Translation for Code-Switched Data Journal Article

In: arXiv preprint arXiv:2202.09625, 2022.

Tags: Code-Switching

2021

Parikh, Dwija; Solorio, Thamar

Normalization and Back-Transliteration for Code-Switched Data Inproceedings

In: Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching, pp. 119–124, ACL, 2021.

Tags: Code-Switching

Chen, Shuguang; Solorio, Thamar; Black, Alan W

Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching Conference

Proceedings of the Fifth Workshop on Computational Approaches to Linguistic Code-Switching, 2021.

Tags: Code-Switching

2020

Aguilar, Gustavo; Solorio, Thamar

From English to Code-Switching: Transfer Learning with Strong Morphological Clues Conference

The 58th Annual Meeting of the Association for Computational Linguistics, ACL, 2020.

Tags: Code-Switching, Transfer learning

Aguilar, Gustavo; Kar, Sudipta; Solorio, Thamar

LinCE: A Centralized Linguistic Code-Switching Evaluation Benchmark Conference

Proceedings of the Twelfth International Conference on Language Resources and Evaluation, LREC, 2020.

Tags: benchmark, Code-Switching

@conference{aguilar20_lince,

title = {LinCE: A Centralized Linguistic Code-Switching Evaluation Benchmark},

author = {Gustavo Aguilar and Sudipta Kar and Thamar Solorio},

editor = {LREC},

url = {https://www.aclweb.org/anthology/2020.lrec-1.223.pdf},

year  = {2020},

date = {2020-05-11},

booktitle = {Proceedings of the Twelfth International Conference on Language Resources and Evaluation},

publisher = {LREC},

abstract = {Recent trends in NLP research have raised an interest in linguistic code-switching (CS); modern approaches have been proposed to solve a wide range of NLP tasks on multiple language pairs. Unfortunately, these proposed methods are hardly generalizable to different code-switched languages. In addition, it is unclear whether a model architecture is applicable for a different task while still being compatible with the code-switching setting. This is mainly because of the lack of a centralized benchmark and the sparse corpora that researchers employ based on their specific needs and interests. To facilitate research in this direction, we propose a centralized benchmark for \textbf{Lin}guistic \textbf{C}ode-switching \textbf{E}valuation (\textbf{LinCE}) that combines ten corpora covering four different code-switched language pairs (i.e., Spanish-English, Nepali-English, Hindi-English, and Modern Standard Arabic-Egyptian Arabic) and four tasks (i.e., language identification, named entity recognition, part-of-speech tagging, and sentiment analysis). As part of the benchmark centralization effort, we provide an online platform at \texttt{ritual.uh.edu/lince}, where researchers can submit their results while comparing with others in real-time. In addition, we provide the scores of different popular models, including LSTM, ELMo, and multilingual BERT so that the NLP community can compare against state-of-the-art systems. LinCE is a continuous effort, and we will expand it with more low-resource languages and tasks.},

keywords = {benchmark, Code-Switching},

pubstate = {published},

tppubtype = {conference}

}

Patwa, Parth; Aguilar, Gustavo; Kar, Sudipta; Pandey, Suraj; PYKL, Srinivas; Gambäck, Björn; Chakraborty, Tanmoy; Solorio, Thamar; Das, Amitava

SemEval-2020 Task 9: Overview of Sentiment Analysis of Code-Mixed Tweets Inproceedings

In: Proceedings of the Fourteenth Workshop on Semantic Evaluation, pp. 774–790, International Committee for Computational Linguistics, Barcelona (online), 2020.

Tags: Code-Switching, Sentiment analysis

2018

Maharjan, Suraj; Mave, Deepthi; Solorio, Thamar

Language Identification and Analysis of Code-Switched Social Media Text Workshop

Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching, ACL 2018, Association for Computational Linguistics, Melbourne, Australia, 2018.

Tags: Code-Switching

Aguilar, Gustavo; AlGhamdi, Fahad; Soto, Victor; Diab, Mona; Hirschberg, Julia; Solorio, Thamar

Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task Inproceedings

In: Proceedings of the Third Workshop on Computational Approaches to Linguistic Code-Switching, Association for Computational Linguistics, Melbourne, Australia, 2018.

Tags: Code-Switching, English-Spanish, Modern Standard Arabic-Egyptian, NER, shared task, Social Media

2016

Molina, Giovanni; Rey-Villamizar, Nicolas; Solorio, Thamar; AlGhamdi, Fahad; Ghoneim, Mahmoud; Hawwari, Abdelati; Diab, Mona

Overview for the second shared task on language identification in code-switched data Proceeding

Proceedings of the Second Workshop on Computational Approaches to Code Switching; EMNLP 2016, 2016.

Tags: Code-Switching, shared task

AlGhamdi, Fahad; Molina, Giovanni; Diab, Mona; Solorio, Thamar; Hawwari, Abdelati; Soto, Victor; Hirschberg, Julia

Part of Speech Tagging for Code Switched Data Proceeding

Proceedings of the Second Workshop on Computational Approaches to Code Switching; EMNLP, 2016.

Tags: Code-Switching

Samih, Younes; Maharjan, Suraj; Attia, Mohammed; Kallmeyer, Laura; Solorio, Thamar

Multilingual Code-switching Identification via LSTM Recurrent Neural Networks Proceeding

Proceedings of the Second Workshop on Computational Approaches to Code Switching; EMNLP, 2016.

Tags: Code-Switching, CRF, Deeplearning, Neural Networks

2015

Maharjan, Suraj; Blair, Elizabeth; Bethard, Steven; Solorio, Thamar

Developing Language-tagged Corpora for Code-switching Tweets Inproceedings

In: Proceedings of The 9th Linguistic Annotation Workshop, pp. 72–84, Association for Computational Linguistics, Denver, Colorado, USA, 2015.

Tags: Code-Switching

2014

Solorio, Thamar; Blair, Elizabeth; Maharjan, Suraj; Bethard, Steven; Diab, Mona; Ghoneim, Mahmoud; Hawwari, Abdelati; AlGhamdi, Fahad; Hirschberg, Julia; Chang, Alison; Fung, Pascale

Overview for the First Shared Task on Language Identification in Code-Switched Data Conference

Proceedings of The First Workshop on Computational Approaches to Code Switching, held in conjunction with EMNLP 2014, ACL, Doha, Qatar, 2014.

Tags: Code-Switching