SIGTYP 2021 Workshop @NAACL

June 10, 2021 • Mexico City, Mexico

Co-located with NAACL2021

SIGTYP2021 Proceedings are now available here
Join our RocketChat channel

Important Dates

December 22, 2020: First Call for Workshop Papers
March 1, 2021: Second Call for Workshop Papers
March 29, 2021: Paper submission deadline (extended from March 15, 2021)
April 15, 2021: Notification of acceptance
April 26, 2021: Camera-ready deadline
June 10, 2021: Workshop

Keynote Speakers

Claire Bowern

Miryam de Lhoneux

Johannes Bjerva

David Yarowsky

Workshop Description

Encouraged by the 2019 and 2020 workshops, the third edition of the SIGTYP workshop aims to act as a platform and a forum for the exchange of information between typology-related research, multilingual NLP, and other research areas that can lead to the development of truly multilingual NLP methods. The workshop is specifically aimed at raising awareness of linguistic typology and its potential in supporting and widening the global reach of multilingual NLP, as well as at introducing computational approaches to linguistic typology. It will foster research and discussion on open problems, not only within the active community working on cross- and multilingual NLP but also with input from leading researchers in linguistic typology.

Starting from 2021, the workshop will be dedicated to a shared theme, a central topic that most keynote talks and discussions will focus on. In 2021, we would like to follow up on a recent debate on linguistic diversity (p-linguistics, the study of individual languages) and universalism (g-linguistics, the study of Human Language); see Haspelmath (2020) and Evans and Levinson (2009). Annotating highly cross-lingual corpora, such as the recently introduced Universal Dependencies (Nivre et al., 2016) and UniMorph (Sylak-Glassman, 2016), requires distinguishing language-specific, historically accidental phenomena from truly universal phenomena, such as the fact that all languages have demonstratives (Diessel, 2014). Our workshop will serve as a platform to enable fruitful discussions on the topic.

Main Topics

The workshop will provide focussed discussion on a range of topics, including (but not limited to) the following:

• Language-independence in training, architecture design, and hyperparameter tuning. Is it possible (and if so, how) to uncover unknown biases that hinder the cross-lingual performance of NLP algorithms and to leverage knowledge of such biases in NLP algorithms?

• Integration of typological features in language transfer and joint multilingual learning. In addition to established techniques such as “selective sharing”, are there alternative ways of encoding heterogeneous external knowledge in machine learning algorithms?

• New applications. The application of typology to currently uncharted territories, i.e. the use of typological information in NLP tasks where such information has not been investigated yet.

• Automatic inference of typological features. The pros and cons of existing techniques (e.g. heuristics derived from morphosyntactic annotation, propagation from features of other languages, supervised Bayesian and neural models) and discussion on emerging ones.

• Typology and interpretability. The use of typological knowledge for interpretation of hidden representations of multilingual neural models, multilingual data generation and selection, and typological annotation of texts.

• Improvement and completion of typological databases. Combining linguistic knowledge and automatic data-driven methods towards the joint goal of improving knowledge of cross-linguistic variation and universals.

• Linguistic diversity and universals. Challenges of cross-lingual annotation. Which linguistic phenomena or categories should be considered universal? How should they be annotated?

Accepted Papers (Archival)

Family of Origin and Family of Choice: Massively Parallel Lexiconized Iterative Pretraining for Severely Low Resource Machine Translation
Zhong Zhou and Alexander Waibel

Measuring Prefixation and Suffixation in the Languages of the World
Harald Hammarström

Predicting and Explaining French Grammatical Gender
Saumya Sahai and Dravyansh Sharma

Morph Call: Probing Morphosyntactic Content of Multilingual Transformers
Vladislav Mikhailov, Oleg Serikov and Ekaterina Artemova

OTEANN: Estimating the Transparency of Orthographies with an Artificial Neural Network
Xavier Marjou

Inferring Morphological Complexity from Syntactic Dependency Networks: a test
Guglielmo Inglese and Luca Brigada Villa

A Universal Dependencies Corpora Maintenance Methodology Using Downstream Application
Ran Iwamoto, Hiroshi Kanayama, Alexandre Rademaker and Takuya Ohko

Improving Cross-Lingual Sentiment Analysis via Conditional Language Adversarial Adaptation
Hemanth Kandula and Bonan Min

Improving the Performance of UDify with Linguistic Typology Knowledge
Chinmay Choudhary

FrameNet and Typology
Michael Ellsworth, Collin Baker and Miriam R L Petruck

Extended Abstracts

Graph Convolutional Network for Swahili News Classification
Alexandros Kastanos and Tyler Martin

Exploring Linguistic Typology Features in Multilingual Machine Translation
Oscar Moreno and Arturo Oncevay

Multilingual Slot and Intent Detection (xSID) with Cross-lingual Auxiliary Tasks
Rob van der Goot, Ibrahim Sharaf, Aizhan Imankulova, Ahmet Üstün, Marija Stepanović, Alan Ramponi, Siti Oryza Khairunnisa, Mamoru Komachi and Barbara Plank

Plugins for Structurally Varied Languages in XMG Framework
Valeria Generalova

Modeling Linguistic Typology - A Probabilistic Graphical Models Approach
Xia Lu

Unsupervised Self-Training for Unsupervised Cross-Lingual Transfer
Akshat Gupta, Sai Krishna Rallabandi and Alan W Black

Let-Mi: An Arabic Levantine Twitter Dataset for Misogynistic Language
Hala Mulki and Bilal Ghanem

Towards Figurative Language Generation in Afrikaans
Imke van Heerden and Anil Bas

Improving access to untranscribed speech by leveraging spoken term detection and self-supervised learning of speech representations
Nay San, Martijn Bartelds and Dan Jurafsky

On the Universality of Lexical Concepts
Bradley Hauer and Grzegorz Kondrak

Quantitative detection of cognacy in the predictive structure of inflection classes: Romance verbal conjugations against the broader typological variation
Borja Herce and Balthasar Bickel

Subword Geometry: Picturing Word Shapes
Olga Sozinova and Tanja Samardzic

A Look to Languages through the Glass of BPE Compression
Ximena Gutierrez-Vasques, Tanja Samardzic and Christian Bentz

Information-Theoretic Characterization of Morphological Fusion
Neil Rathi, Michael Hahn and Richard Futrell

Submission Format

SIGTYP 2021 will consider both archival and non-archival work. We will issue a call for extended abstract submissions (non-archival) and general paper submissions (archival). The accepted submissions will be presented at the workshop, providing new insights and ideas. For non-archival work, we will allow 2-page abstracts of already published work or work in progress. This way, we will not discourage researchers from preferring main conference proceedings, while still ensuring that interesting and thought-provoking research is presented at the workshop. In addition, we will consider general archival submissions (4-page and 8-page papers).

Program Committee (Thank you ALL!)

    Željko Agić, Corti
    Emily Ahn, University of Washington
    Isabelle Augenstein, University of Copenhagen
    Emily Bender, University of Washington
    Johannes Bjerva, University of Copenhagen
    Claire Bowern, Yale University
    Miriam Butt, University of Konstanz
    Giuseppe Celano, Leipzig University
    Agnieszka Falenska, University of Stuttgart
    Richard Futrell, University of California, Irvine
    Elisabetta Ježek, University of Pavia
    Gerhard Jäger, University of Tübingen
    John Mansfield, The University of Melbourne
    Paola Merlo, University of Geneva
    Joakim Nivre, Uppsala University
    Robert Östling, Stockholm University
    Thomas Proisl, FAU Erlangen-Nürnberg
    Michael Regan, University of New Mexico
    Ella Rabinovich, University of Toronto
    Tanja Samardžić, University of Zurich
    Richard Sproat, Google Japan
    Sabine Stoll, University of Zurich
    Daan van Esch, Google AI
    Giulia Venturi, ILC "Antonio Zampolli"
    Nidhi Vyas, Apple
    Ada Wan, University of Zurich
    Eleanor Chodroff, University of York
    Elizabeth Salesky, Johns Hopkins University
    Sabrina Mielke, Johns Hopkins University
    Edoardo Ponti, University of Cambridge
    Damián Blasi, Harvard University
    Adina Williams, Facebook
    Ivan Vulić, University of Cambridge
    Arturo Oncevay, University of Edinburgh
    Koel Dutta Chowdhury, Saarland University
    Elena Klyachko, National Research University Higher School of Economics
    Alexey Sorokin, Moscow State University
    Sylvain Kahane, Université Paris Nanterre
    Taraka Rama, University of North Texas
    Harald Hammarström, Max Planck Institute for the Science of Human History
    Olga Lyashevskaya, National Research University Higher School of Economics
    Kaushal Kumar Maurya, IIT Hyderabad
    Johann-Mattis List, Max Planck Institute for the Science of Human History
    Garrett Nicolai, University of British Columbia
    Yevgeni Berzak, Technion
    Olga Zamaraeva, University of Washington
    Zoey Liu, Boston College
    Jeff Good, University at Buffalo
    Priya Rani, National University of Ireland
    Silvia Luraghi, University of Pavia
    Beata Trawinski, University of Vienna
    Miryam de Lhoneux, University of Copenhagen
    Kemal Kurniawan, University of Melbourne
    Andreas Shcerbakov, University of Melbourne
    Ritesh Kumar, Agra University

Organizers

Ekaterina Vylomova	Elizabeth Salesky	Sabrina Mielke	Gabriella Lapesa	Ritesh Kumar	Edoardo M. Ponti
Harald Hammarström	Eleanor Chodroff	Anna Korhonen	Roi Reichart	Ivan Vulić	Ryan Cotterell