Code-Switching Publications

2017

Emre Yılmaz, Jelske Dijkstra, Hans Van de Velde, Frederik Kampstra, Jouke Algra, Henk van den Heuvel and David Van Leeuwen. (2017) Longitudinal Speaker Clustering and Verification Corpus with Code-Switching Frisian-Dutch Speech. Proc. Interspeech 2017, 37-41, DOI: 10.21437/Interspeech.2017-301.

Emre Yılmaz, Henk van den Heuvel and David Van Leeuwen. (2017) Exploiting Untranscribed Broadcast Data for Improved Code-Switching Detection. Proc. Interspeech 2017, 42-46, DOI: 10.21437/Interspeech.2017-391.

Vikram Ramanarayanan & David Suendermann-Oeft. (2017) Jee haan, I’d like both, por favor: Elicitation of a Code-Switched Corpus of Hindi–English and Spanish–English Human–Machine Dialog. Proc. Interspeech 2017, 47-51, DOI: 10.21437/Interspeech.2017-1198.

SaiKrishna Rallabandi, Alan W Black. (2017) On building mixed lingual speech synthesis systems. Proc. Interspeech 2017, 52-56, DOI: 10.21437/Interspeech.2017-1244.

Khyathi Raghavi Chandu, Sai Krishna Rallabandi, Sunayana Sitaram, Alan W Black. (2017) Speech Synthesis for Mixed-Language Navigation Instructions. Proc. Interspeech 2017, 57-61, DOI: 10.21437/Interspeech.2017-1259.

Djegdjiga Amazouz, Martine Adda-Decker, Lori Lamel. (2017) Addressing Code-Switching in French/Algerian Arabic Speech. Proc. Interspeech 2017, 62-66, DOI: 10.21437/Interspeech.2017-1373.

Gualberto Guzmán, Joseph Ricard, Jacqueline Serigos, Barbara E. Bullock, Almeida Jacqueline Toribio. (2017) Metrics for modeling code-switching across corpora. Proc. Interspeech 2017, 67-71, DOI: 10.21437/Interspeech.2017-1429.

Ewald van der Westhuizen, Thomas Niesler. (2017) Synthesising isiZulu-English code-switch bigrams using word embeddings. Proc. Interspeech 2017, 72-76, DOI: 10.21437/Interspeech.2017-1437.

Victor Soto, Julia Hirschberg. (2017) Crowdsourcing Universal Part-Of-Speech Tags for Code-Switching. Proc. Interspeech 2017, 77-81, DOI: 10.21437/Interspeech.2017-1663.

 


2016

E. Yılmaz, H. van den Heuvel and D. van Leeuwen. “Code-switching Detection Using Multilingual DNNs,” in IEEE Workshop on Spoken Language Technology (SLT), pp. 610-616, San Diego, CA, USA, December 2016.

E. Yılmaz, H. van den Heuvel and D. van Leeuwen. “Investigating Bilingual Deep Neural Networks for Automatic Recognition of Code-switching Frisian Speech,” Procedia Computer Science, vol. 81, pp. 159-166, May 2016.

Meng Xuan Xia. Codeswitching language identification using Subword Information Enriched Word Vectors, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 132-136.

Utpal Kumar Sikdar and Björn Gambäck. Language Identification in Code-Switched Text Using Conditional Random Fields and Babelnet, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 127-131.

Prajwol Shrestha. Codeswitching Detection via Lexical Features in Conditional Random Fields, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 121-126.

Rouzbeh Shirvani, Mario Piergallini, Gauri Shankar Gautam and Mohamed Chouikha. The Howard University System Submission for the Shared Task in Language Identification in Spanish-English Codeswitching, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 116-120.

Arunavha Chanda, Dipankar Das and Chandan Mazumdar. Columbia-Jadavpur submission for EMNLP 2016 Code-Switching Workshop Shared Task: System description, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 112-115.

Mohamed Al-Badrashiny and Mona Diab. The George Washington University System for the Code-Switching Workshop Shared Task 2016, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 108-111.

Fahad AlGhamdi, Giovanni Molina, Mona Diab, Thamar Solorio, Abdelati Hawwari, Victor Soto and Julia Hirschberg. Part of Speech Tagging for Code Switched Data, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 98-107. (Presentation)

Souvick Ghosh, Satanu Ghosh and Dipankar Das. Part-of-speech Tagging of Code-Mixed Social Media Text, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 90-97.

Arunavha Chanda, Dipankar Das and Chandan Mazumdar. Unraveling the English-Bengali Code-Mixing Phenomenon, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 80-89.

Meng Xuan Xia and Jackie Chi Kit Cheung. Accurate Pinyin-English Codeswitched Language Identification, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 71-79.

Younes Samih, Wolfgang Maier and Laura Kallmeyer. SAWT: Sequence Annotation Web Tool, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 65-70.

Aaron Jaech, George Mulcaire, Mari Ostendorf and Noah A. Smith. A Neural Model for Language Identification in Code-Switched Tweets, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 60-64. (Presentation)

Younes Samih, Suraj Maharjan, Mohammed Attia, Laura Kallmeyer and Thamar Solorio. Multilingual Code-switching Identification via LSTM Recurrent Neural Networks, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 50-59. (Presentation)

Giovanni Molina, Fahad AlGhamdi, Mahmoud Ghoneim, Abdelati Hawwari, Nicolas Rey-Villamizar, Mona Diab and Thamar Solorio. Overview for the Second Shared Task on Language Identification in Code-Switched Data, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 40-49. (Presentation)

Utsab Barman, Joachim Wagner and Jennifer Foster. Part-of-speech Tagging of Code-mixed Social Media Content: Pipeline, Stacking and Joint Modelling, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 30-39.

Mario Piergallini, Rouzbeh Shirvani, Gauri S. Gautam and Mohamed Chouikha. Word-Level Language Identification and Predicting Codeswitching Points in Swahili-English Language Data, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 21-29. (Presentation)

Gualberto A. Guzman, Jacqueline Serigos, Barbara E. Bullock and Almeida Jacqueline Toribio. Simple Tools for Exploring Variation in Code-switching for Linguists, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 12-20. (Presentation)

Özlem Çetinoğlu, Sarah Schulz and Ngoc Thang Vu. Challenges of Computational Processing of Code-Switching, In Proceedings of the Second Workshop on Computational Approaches to Code Switching (EMNLP 2016), Austin, Texas, Association for Computational Linguistics, 1-11. (Presentation)

Arnav Sharma, Sakshi Gupta, Raveesh Motlani, Piyush Bansal, Manish Shrivastava, Radhika Mamidi and Dipti M. Sharma. Shallow Parsing Pipeline – Hindi-English Code-Mixed Social Media Text, In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, California, Association for Computational Linguistics, 1340-1345.

Sunayana Sitaram and Alan W Black. 2016. Speech Synthesis of Code-Mixed Text, In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May. European Language Resources Association (ELRA).

Rafiya Begum, Kalika Bali, Monojit Choudhury, Koustav Rudra and Niloy Ganguly. 2016. Functions of Code-Switching in Tweets: An Annotation Framework and Some Initial Experiments, In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May. European Language Resources Association (ELRA).

David Vilares, Miguel A. Alonso and Carlos Gómez-Rodríguez. 2016. EN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis, In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May. European Language Resources Association (ELRA).

Mona Diab, Mahmoud Ghoneim, Abdelati Hawwari, Fahad AlGhamdi, Nada AlMarwani and Mohamed Al-Badrashiny. 2016. Creating a Large Multi-Layered Representational Repository of Linguistic Code Switched Arabic Data, In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May. European Language Resources Association (ELRA).

Björn Gambäck and Amitava Das. 2016. Comparing the Level of Code-Switching in Corpora, In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May. European Language Resources Association (ELRA).

Özlem Çetinoğlu. 2016. A Turkish-German Code-Switching Corpus, In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May. European Language Resources Association (ELRA).

Emre Yılmaz, Maaike Andringa, Sigrid Kingma, Jelske Dijkstra, Frits Van der Kuip, Hans Van de Velde, Frederik Kampstra, Jouke Algra, Henk van den Heuvel and David van Leeuwen. 2016. A Longitudinal Bilingual Frisian-Dutch Radio Broadcast Database Designed for Code-Switching Research, In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May. European Language Resources Association (ELRA).

Younes Samih and Wolfgang Maier. 2016. An Arabic-Moroccan Darija Code-Switched Corpus, In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), Portorož, Slovenia, May. European Language Resources Association (ELRA).


2015

Anupam Jamatia, Björn Gambäck, and Amitava Das. 2015. Part-of-Speech Tagging for Code-Mixed English-Hindi Twitter and Facebook Chat Messages. In Proceedings of 10th Recent Advances of Natural Language Processing (RANLP), pages 239-248, Bulgaria, September.


Dan Garrette, Hannah Alpert-Abrams, Taylor Berg-Kirkpatrick, and Dan Klein. 2015. Unsupervised Code-Switching for Multilingual Historical Document Transcription. In Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the ACL, pages 1036-1041, Denver, Colorado, May-June. Association for Computational Linguistics.


David Vilares, Miguel A. Alonso, and Carlos Gómez-Rodríguez. 2015. One model, two languages: training bilingual parsers with harmonized treebanks. In arXiv:1507.08449 [cs.CL], July.


David Vilares, Miguel A. Alonso, and Carlos Gómez-Rodríguez. 2015. Sentiment Analysis on Monolingual, Multilingual and Code-Switching Twitter Corpora. In Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis (WASSA 2015), pages 2–8, Lisboa, Portugal, September. Association for Computational Linguistics.


Heike Adel, Ngoc T. Vu, K. Kirchhoff, D. Telaar, and T. Schultz. 2015. Syntactic and Semantic Features for Code-Switching Factored Language Models. In IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 3, pages 431-440, March. TASLP 2015.


Sophia Yat Mei Lee and Zhongqing Wang. 2015. Emotion in Code-switching Texts: Corpus Construction and Analysis. In Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing (SIGHAN-8), pages 91-99, Beijing, China, July. Association for Computational Linguistics and Asian Federation of Natural Language Processing.


2014

Amitava Das and Björn Gambäck. 2014. Code-Mixing in Social Media Text: The Last Language Identification Frontier? In Traitement Automatique des Langues (TAL): Special Issue on Social Networks and NLP, vol. 54, no. 3/2013, pages 41-64.


Amitava Das and Björn Gambäck. 2014. Identifying Languages at the Word Level in Code-Mixed Indian Social Media Text. In the 11th International Conference on Natural Language Processing (ICON-2014), Goa, India, December.


Björn Gambäck and Amitava Das. 2014. On Measuring the Complexity of Code-Mixing. In the Workshop on Language Technologies for Indian Social Media (सOCIAL-ईNDIA 2014), 11th International Conference on Natural Language Processing (ICON-2014), pages 1-7, Goa, India, December.


Chu-Cheng Lin, Waleed Ammar, Lori Levin, and Chris Dyer. 2014. The CMU Submission for the Shared Task on Language Identification in Code-Switched Data. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 80-86, Doha, Qatar, October. Association for Computational Linguistics.


David Jurgens, Stefan Dimitrov, and Derek Ruths. 2014. Twitter Users #CodeSwitch Hashtags! #MoltoImportante #wow #헐. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 51-61, Doha, Qatar, October. Association for Computational Linguistics.


Evangelos Papalexakis, Dong Nguyen, and A. Seza Doğruöz. 2014. Predicting Code-switching in Multilingual Communication for Immigrant Communities. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 42-50, Doha, Qatar, October. Association for Computational Linguistics.


Fei Huang and Alexander Yates. 2014. Improving Word Alignment Using Linguistic Code Switching Data. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, pages 1–9, Gothenburg, Sweden, April. Association for Computational Linguistics.


Gokul Chittaranjan, Yogarshi Vyas, Kalika Bali, and Monojit Choudhury. 2014. Word-level Language Identification using CRF: Code-switching Shared Task Report of MSR India System. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 73-79, Doha, Qatar, October. Association for Computational Linguistics.


Heba Elfardy, Mohamed Al-Badrashiny, and Mona Diab. 2014. AIDA: Identifying Code Switching in Informal Arabic Text. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 94–101, Doha, Qatar, October. Association for Computational Linguistics.


Igor Labutov and Hod Lipson. 2014. Generating Code-switched Text for Lexical Learning. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, pages 562-571, Baltimore, Maryland, USA, June. Association for Computational Linguistics.


Kalika Bali, Jatin Sharma, Monojit Choudhury, Yogarshi Vyas. 2014. “I am borrowing ya mixing ?” An Analysis of English-Hindi Code Mixing in Facebook. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 116–126, Doha, Qatar, October. Association for Computational Linguistics.


Kfir Bar and Nachum Dershowitz. 2014. The Tel Aviv University System for the Code-Switching Workshop Shared Task. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 139–143, Doha, Qatar, October. Association for Computational Linguistics.


Levi King, E. Baucom, T. Gilmanov, S. Kübler, D. Whyatt, W. Maier, and P. Rodrigues. 2014. The IUCL+ System: Word-Level Language Identification via Extended Markov Models. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 102–106, Doha, Qatar, October. Association for Computational Linguistics.


Marine Carpuat. 2014. Mixed Language and Code-Switching in the Canadian Hansard. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 107-115, Doha, Qatar, October. Association for Computational Linguistics.


Martin Volk and Simon Clematide. 2014. Detecting Code-Switching in a Multilingual Alpine Heritage Corpus. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 24-33, Doha, Qatar, October. Association for Computational Linguistics.


Naman Jain and Riyaz Ahmad Bhat. 2014. Language Identification in Code-Switching Scenario. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 87–93, Doha, Qatar, October. Association for Computational Linguistics.


Nanyun Peng, Yiming Wang, and Mark Dredze. 2014. Learning Polylingual Topic Models from Code-Switched Social Media Documents. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Short Papers), pages 674-679, Baltimore, Maryland, USA, June. Association for Computational Linguistics.


Ngoc Thang Vu and Tanja Schultz. 2014. Exploration of the Impact of Maximum Entropy in Recurrent Neural Network Language Models for Code-Switching Speech. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 34-41, Doha, Qatar, October. Association for Computational Linguistics.


Prajwol Shrestha. 2014. Incremental N-gram Approach for Language Identification in Code-Switched Text. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 133–138, Doha, Qatar, October. Association for Computational Linguistics.


Ramy Eskander, Mohamed Al-Badrashiny, Nizar Habash, and Owen Rambow. 2014. Foreign Words and the Automatic Processing of Arabic Social Media Text Written in Roman Script. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 1-12, Doha, Qatar, October. Association for Computational Linguistics.


Thamar Solorio, E. Blair, S. Maharjan, S. Bethard, M. Diab, M. Ghoneim, A. Hawwari, F. AlGhamdi, J. Hirschberg, A. Chang, and P. Fung. 2014. Overview for the First Shared Task on Language Identification in Code-Switched Data. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 62-72, Doha, Qatar, October. Association for Computational Linguistics.


Utsab Barman, Amitava Das, Joachim Wagner, and Jennifer Foster. 2014. Code-Mixing: A Challenge for Language Identification in the Language of Social Media. In Proceedings of The First Workshop on Computational Approaches to Code Switching, EMNLP 2014, pages 13–23, Doha, Qatar, October. Association for Computational Linguistics.


Utsab Barman, Joachim Wagner, Grzegorz Chrupała, and Jennifer Foster. 2014. DCU-UVT: Word-Level Language Classification with Code-Mixed Data. In Proceedings of The First Workshop on Computational Approaches to Code Switching, pages 127–132, Doha, Qatar, October. Association for Computational Linguistics.


Ying Li and Pascale Fung. 2014. Language Modeling with Functional Head Constraint for Code Switching Speech Recognition. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 907–916, Doha, Qatar, October. Association for Computational Linguistics.


2013

Ben King and Steven Abney. 2013. Labeling the languages of words in mixed-language documents using weakly supervised methods. In Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 1110–1119, Atlanta, Georgia, June. Association for Computational Linguistics.


Constantine Lignos and Mitch Marcus. 2013. Toward web-scale analysis of codeswitching. In the 87th Annual Meeting of the Linguistic Society of America.


Dong Nguyen and A. Seza Doğruöz. 2013. Word level language identification in online multilingual communication. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages 857–862, Seattle, Washington, USA, October. Association for Computational Linguistics.


2012

Nathaniel Oco and Rachel Edita Roxas. 2012. Pattern Matching Refinements to Dictionary-Based Code-Switching Point Detection. In Proceedings of the 26th Pacific Asia Conference on Language, Information, and Computation.

Ying Li and Pascale Fung. 2012. Code-switch Language Model with Inversion Constraints for Mixed Language Speech Recognition. In Proceedings of COLING 2012: Technical Papers, pages 1671–1680, Mumbai, December. COLING 2012.


Ying Li, Yue Yu, and Pascale Fung. 2012. A Mandarin-English code-switching corpus. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC-2012), pages 2515–2519, Istanbul, Turkey, May. European Language Resources Association (ELRA).


2010

D.C. Lyu, T.P. Tan, E. Chng, and H. Li. 2010. SEAME: a Mandarin-English code-switching speech corpus in South-East Asia. In INTERSPEECH, vol. 10, pages 1986–1989.


2008

Thamar Solorio and Yang Liu. 2008a. Learning to predict code-switching points. In Empirical Methods on Natural Language Processing, EMNLP-2008, pages 973–981, Honolulu, Hawaii, October. Association for Computational Linguistics.


Thamar Solorio and Yang Liu. 2008b. Part-of-speech tagging for English-Spanish code-switched text. In Empirical Methods on Natural Language Processing, EMNLP-2008, pages 1051–1060, Honolulu, Hawaii, October. Association for Computational Linguistics.


2007

Anil Kumar Singh and Jagadeesh Gorla. 2007. Identification of languages and encodings in a multilingual document. In Proceedings of ACL-SIGWAC’s Web As Corpus 3, Belgium.


2005

Paul McNamee. 2005. Language identification: A solved problem suitable for undergraduate instruction. In the Journal of Computing Sciences in Colleges, vol. 20, no. 3, pages 94–101, February.


1982

A. Joshi. 1982. Processing of sentences with intrasentential code-switching. In COLING-82, pages 145–150, Prague, July.


1980

S. Poplack. 1980. Sometimes I’ll start a sentence in Spanish y termino en español: toward a typology of code-switching. Linguistics, 18(7/8):581–618.


1978

J. Lipski. 1978. Code-switching and the problem of bilingual competence. In Aspects of bilingualism, pages 250–264. Hornbeam.