SIGTYP 2022 Workshop, July 14th @NAACL

July 14th, 2022 • Seattle, WA

Co-located with NAACL2022

SIGTYP2022 Proceedings are now available here
Join our RocketChat channel

Keynote Speakers

Kristen Howell

Isabel Papadimitriou

Graham Neubig

Workshop Description

Encouraged by the 2019-2021 workshops, the fourth edition of the SIGTYP workshop aims to act as a platform and forum for the exchange of information between typology-related research, multilingual NLP, and other research areas that can lead to the development of truly multilingual NLP methods. The workshop is specifically aimed at raising awareness of linguistic typology and its potential in supporting and widening the global reach of multilingual NLP, as well as at introducing computational approaches to linguistic typology. It will foster research and discussion on open problems, not only within the active community working on cross- and multilingual NLP but also by inviting input from leading researchers in linguistic typology.

Since 2021, the workshop has been dedicated to a shared theme, a central topic on which most keynote talks and discussions focus. For instance, in 2021 we followed up on a recent debate on linguistic diversity (p-linguistics, the study of individual languages) and universalism (g-linguistics, the study of Human Language); see Haspelmath (2020) and Evans and Levinson (2009). Annotating highly cross-lingual corpora, such as the recently introduced Universal Dependencies (Nivre et al., 2016) and UniMorph (Sylak-Glassman, 2016), requires distinguishing language-specific, historically accidental phenomena from truly universal phenomena, such as the fact that all languages have demonstratives (Diessel, 2014). Our workshop will serve as a platform to enable fruitful discussions on the topic.

In 2022, we would like to continue in this direction of research, with a special focus on using technology to foster the documentation of under-described languages.

Main Topics

The workshop will provide focussed discussion on a range of topics, including (but not limited to) the following:

• Integration of typological features in language transfer and joint multilingual learning. In addition to established techniques such as “selective sharing”, are there alternative ways of encoding heterogeneous external knowledge in machine learning algorithms?

• Development of a unified taxonomy and resources. Building universal databases and models to facilitate understanding and processing of diverse languages.

• Automatic inference of typological features. The pros and cons of existing techniques (e.g. heuristics derived from morphosyntactic annotation, propagation from features of other languages, supervised Bayesian and neural models) and discussion on emerging ones.

• Typology and interpretability. The use of typological knowledge for interpretation of hidden representations of multilingual neural models, multilingual data generation and selection, and typological annotation of texts.

• Improvement and completion of typological databases. Combining linguistic knowledge and automatic data-driven methods towards the joint goal of improving knowledge of cross-linguistic variation and universals.

• Linguistic diversity and universals. Challenges of cross-lingual annotation. Which linguistic phenomena or categories should be considered universal? How should they be annotated?

• Bringing technology to document under-described languages. Improving model performance and documentation of under-resourced languages using typological databases, multilingual models and data from high-resource languages.

Accepted Papers (Archival)

Multilingualism Encourages Recursion: a Transfer Study with mBERT
Andrea Gregor de Varda, Roberto Zamparelli

Word-order Typology in Multilingual BERT: A Case Study in Subordinate-Clause Detection
Dmitry Nikolaev, Sebastian Pado

Typological Word Order Correlations with Logistic Brownian Motion
Kai Hartung, Gerhard Jäger, Munir Georges, Sören Gröttrup

Cross-linguistic Comparison of Linguistic Feature Encoding in BERT Models for Typologically Different Languages
Yulia Otmakhova, Karin Verspoor, Jey Han Lau

Tweaking UD Annotations to Investigate the Placement of Determiners, Quantifiers and Numerals in the Noun Phrase
Luigi Talamo

A Database for Modal Semantic Typology
Qingxia Guo, Nathaniel Imel, Shane Steinert-Threlkeld

Extended Abstracts

PaVeDa - Pavia Verbs Database: Challenges and Perspectives
Chiara Zanchi, Silvia Luraghi, Claudia Roberta Combei

ParaNames: A Massively Multilingual Entity Name Corpus
Jonne Sälevä, Constantine Lignos

Investigating Information-theoretic Properties of the Typology of Spatial Demonstratives
Sihan Chen, Richard Futrell, Kyle Mahowald

How Universal is Metonymy? Results from a Large-Scale Multilingual Analysis
Temuulen Khishigsuren, Gábor Bella, Thomas Brochhagen, Daariimaa Marav, Fausto Giunchiglia, Khuyagbaatar Batsuren

Submission Format

SIGTYP 2022 will consider both archival and non-archival work. We will issue a call for extended abstract submissions (non-archival) and general paper submissions (archival). Accepted submissions will be presented at the workshop, providing new insights and ideas. For non-archival work, we will allow 2-page abstracts of already published work or work in progress. This way, we avoid discouraging researchers from submitting to main conference proceedings while ensuring that interesting and thought-provoking research is presented at the workshop. In addition, we will consider general archival submissions (4-page and 8-page papers).

Program Committee (THANK YOU!)

    Johannes Bjerva, Aalborg University
    Emily Ahn, University of Washington
    Miriam Butt, University of Konstanz
    John Mansfield, The University of Melbourne
    Daan van Esch, Google AI
    Elisabetta Ježek, University of Pavia
    Paola Merlo, University of Geneva
    Joakim Nivre, Uppsala University
    Robert Östling, Stockholm University
    Ivan Vulić, University of Cambridge
    Richard Sproat, Google Japan
    Željko Agić, Corti
    Agnieszka Falenska, University of Stuttgart
    Edoardo Maria Ponti, MILA
    Alexey Sorokin, Moscow State University
    Tanja Samardžić, University of Zurich
    Kemal Kurniawan, The University of Melbourne
    Aryaman Arora, Georgetown University
    Samopriya Basu, The University of North Carolina at Chapel Hill
    Badr M. Abdullah, Saarland University
    Guglielmo Inglese, KU Leuven
    Olga Zamaraeva, University of Washington
    Nianwen Xue, Brandeis University
    Borja Herce, The University of Zurich
    Chinmay Choudhary, National University of Ireland, Galway
    Bradley Hauer, The University of Alberta
    Michael Hahn, Stanford University

Organizers

Hila Gonen	Alexey Sorokin	Sabrina Mielke	Gabriella Lapesa	Ritesh Kumar	Edoardo M. Ponti
Harald Hammarström	Andrey Shcherbakov	Pranav A	Jonas Pfeiffer	Ryan Cotterell	Ekaterina Vylomova