May 6, 2023 • Dubrovnik, Croatia
Co-located with EACL 2023
SIGTYP 2023 Proceedings are now available here
Join our RocketChat channel
Important Dates
   January 9, 2023:  First Call for Workshop Papers
   January 30, 2023:  Second Call for Workshop Papers
   February 22, 2023 (extended from February 13, 2023):  Paper submission deadline
   March 23, 2023 (moved from March 13, 2023):  Notification of acceptance
   April 4, 2023:  Camera-ready deadline
   May 6, 2023:  Workshop
Keynote Speakers
Workshop Description
Encouraged by the 2019-2022 workshops, the fifth edition of the SIGTYP workshop aims to act as a platform and a forum for the exchange of information between typology-related research, multilingual NLP, and other research areas that can lead to the development of truly multilingual NLP methods. The workshop is specifically aimed at raising awareness of linguistic typology and its potential in supporting and widening the global reach of multilingual NLP, as well as at introducing computational approaches to linguistic typology. It will foster research and discussion on open problems, not only within the active community working on cross- and multilingual NLP but also by inviting input from leading researchers in linguistic typology.
Since 2021, the workshop has been dedicated to a shared theme, a central topic on which most keynote talks and discussions focus. For instance, in 2021 we followed up on a recent debate on linguistic diversity (p-linguistics, the study of individual languages) and universalism (g-linguistics, the study of Human Language); see Haspelmath (2020) and Evans and Levinson (2009). The process of annotating highly cross-lingual corpora (such as the recently introduced Universal Dependencies (Nivre et al., 2016) and UniMorph (Sylak-Glassman, 2016)) requires distinguishing language-specific, historically accidental phenomena from truly universal phenomena, such as the fact that all languages have demonstratives (Diessel, 2014). Our workshop will serve as a platform to enable fruitful discussions.
In 2023, we would like to continue in this direction of research, with a special focus on using technology to foster the documentation of under-described languages.
 
Main Topics
The workshop will provide focussed discussion on a range of topics, including (but not limited to) the following:
• Integration of typological features in language transfer and joint multilingual learning. In addition to established techniques such as “selective sharing”, are there alternative ways of encoding heterogeneous external knowledge in machine learning algorithms?
• Development of a unified taxonomy and resources. Building universal databases and models to facilitate understanding and processing of diverse languages.
• Automatic inference of typological features. The pros and cons of existing techniques (e.g. heuristics derived from morphosyntactic annotation, propagation from features of other languages, supervised Bayesian and neural models) and discussion on emerging ones.
• Typology and interpretability. The use of typological knowledge for interpretation of hidden representations of multilingual neural models, multilingual data generation and selection, and typological annotation of texts.
• Improvement and completion of typological databases. Combining linguistic knowledge and automatic data-driven methods towards the joint goal of improving knowledge of cross-linguistic variation and universals.
• Linguistic diversity and universals. Challenges of cross-lingual annotation. Which linguistic phenomena or categories should be considered universal? How should they be annotated?
• Bringing technology to document under-described languages. Improving model performance and documentation of under-resourced languages using typological databases, multilingual models and data from high-resource languages.
Accepted Papers (Archival)
     You Can Have Your Data and Balance It Too: Towards Balanced and Efficient Multilingual Models
    Tomasz Limisiewicz, Dan Malkin, Gabriel Stanovsky 
     Multilingual End-to-end Dependency Parsing with Linguistic Typology Knowledge
    Chinmay Choudhary, Colm O’Riordan
     Identifying the Correlation Between Language Distance and Cross-Lingual Transfer in a Multilingual Representation Space 
    Fred Philippy, Siwen Guo, Shohreh Haddadan 
    Using Modern Languages to Parse Ancient Ones: a Test on Old English
    Luca Brigada Villa, Martina Giarda 
    The Denglisch Corpus of German-English Code-Switching
    Doreen Osmelak, Shuly Wintner 
    Trimming Phonetic Alignments Improves the Inference of Sound Correspondence Patterns from Multilingual Wordlists
    Frederic Blum, Johann-Mattis List 
    A Crosslinguistic Database for Combinatorial and Semantic Properties of Attitude Predicates
    Deniz Özyıldız, Ciyang Qing, Floris Roelofsen, Maribel Romero, Wataru Uegaki 
    Corpus-based Syntactic Typological Methods for Dependency Parsing Improvement
     Diego Alves, Božo Bekavac, Daniel Zeman, Marko Tadić 
    Cross-lingual Transfer Learning with Persian
    Sepideh Mollanorozy, Marc Tanti, Malvina Nissim
    Information-Theoretic Characterization of Vowel Harmony: A Cross-Linguistic Study on Word Lists
    Julius Steuer, Johann-Mattis List, Badr M. Abdullah, Dietrich Klakow 
    Revisiting Dependency Length and Intervener Complexity Minimisation on a Parallel Corpus in 35 Languages
    Andrew Thomas Dyer 
    Does Topological Ordering of Morphological Segments Reduce Morphological Modeling Complexity? A Preliminary Study on 13 Languages
    Andreas Shcherbakov, Kat Vylomova
Shared Task Papers (Archival)
     Findings of the SIGTYP 2023 Shared task on Cognate and Derivative Detection For Low-Resourced Languages
    Priya Rani, Koustava Goswami, Adrian Doyle, Theodorus Fransen, Bernardo Stearns, John P. McCrae 
     ÚFAL Submission for SIGTYP Supervised Cognate Detection Task
    Tomasz Limisiewicz
     CoToHiLi at SIGTYP 2023: Ensemble Models for Cognate and Derivative Words Detection
    Liviu P. Dinu, Ioan-Bogdan Iordache, Ana Sabina Uban
Extended Abstracts
     Multilingual BERT has an accent: Evaluating English Influences on Fluency in Multilingual Models
    Isabel Papadimitriou, Kezia Lopez, Dan Jurafsky 
    Language-agnostic Measures Discriminate Inflection and Derivation
    Coleman Haley, Edoardo Ponti, Sharon Goldwater 
    Grambank’s Typological Advances Support Work on Less-resourced Languages
    Hannah J. Haynie, Damian E Blasi, Hedvig Skirgård, Simon J. Greenhill, Quentin D. Atkinson, Russell Gray 
    Gradual Language Model Adaptation Using Fine-Grained Typology
    Marcell Richard Fekete, Johannes Bjerva 
    On the Nature of Discrete Speech Representations in Multilingual Self-supervised Models
    Badr M. Abdullah, Mohammed Maqsood Shaik, Dietrich Klakow
Papers from EACL Findings
     Cross-Lingual Transfer of Cognitive Processing Complexity 
    Charlotte Pouw, Nora Hollenstein, Lisa Beinborn 
     A Large-Scale Multilingual Study of Visual Constraints on Linguistic Selection of Descriptions 
    Uri Berger, Lea Frermann, Gabriel Stanovsky, Omri Abend
     Evaluating the Diversity, Equity and Inclusion of NLP Technology: A Case Study for Indian Languages 
    Simran Khanuja, Sebastian Ruder, Partha Talukdar
     Does Transliteration Help Multilingual Language Modeling? 
    Ibraheem Muhammad Moosa, Mahmud Elahi Akhter, Ashfia Binte Habib
Submission Format
SIGTYP 2023 will consider both archival and non-archival work. We will issue a call for extended abstract submissions (non-archival) and general paper submissions (archival). Accepted submissions will be presented at the workshop, providing new insights and ideas. For non-archival work, we will allow 2-page abstracts of already published work or work in progress. This way, researchers remain free to publish in main conference proceedings, while interesting and thought-provoking research is still presented at the workshop.
In addition, we will consider general archival submissions (4-page and 8-page papers). Unlimited additional pages are allowed for the references section in all submission types.
Submissions should be anonymous, without authors or an acknowledgement section; self-citations should appear in third person.
Submissions should follow the EACL 2023 style guidelines:
https://2023.eacl.org/calls/styles/
Both long and short paper submissions must follow the two-column format of ACL proceedings. All submissions must be in PDF format.
Papers should be submitted via OpenReview:
https://openreview.net/group?id=eacl.org/EACL/2023/Workshop/SIGTYP
 We are accepting all papers from EACL Findings that are relevant to SIGTYP. Contact us via sigtyp AT gmail DOT com if you would like to present your EACL Findings paper at SIGTYP 2023! 
Program Committee (THANK YOU!)
    Miriam Butt, University of Konstanz
    Daan van Esch, Google AI 
    Elisabetta Ježek, University of Pavia 
    Paola Merlo, University of Geneva 
    Joakim Nivre, Uppsala University 
    Robert Östling, Stockholm University 
    Ivan Vulić, University of Cambridge
    Richard Sproat, Google Japan 
    Željko Agić, Corti 
    Edoardo Maria Ponti, University of Edinburgh 
    Alexey Sorokin, Moscow State University 
    Tanja Samardžić, University of Zürich 
    Aryaman Arora, Georgetown University 
    Samopriya Basu, The University of North Carolina at Chapel Hill 
    Badr M. Abdullah, Saarland University 
    Guglielmo Inglese, KU Leuven  
    Olga Zamaraeva, University of Washington 
    Borja Herce, The University of Zurich 
    Michael Hahn, Stanford University 
    Giuseppe Celano, Leipzig University  
    Richard Futrell, University of California, Irvine 
    Gerhard Jäger, University of Tübingen 
    Eitan Grossman, Hebrew University of Jerusalem 
    Johann-Mattis List, University of Passau and Max Planck Institute for Evolutionary Anthropology  
    Miryam de Lhoneux, KU Leuven  
    Giulia Venturi, Istituto di Linguistica Computazionale "Antonio Zampolli"  
    Kristen Howell, University of Washington 
    Barend Beekhuizen, University of Toronto  
    Claire Bowern, Yale University 
    Thomas Proisl, FAU Erlangen-Nürnberg 
    Michael Regan, University of Washington  
	
Organizers
    Koustava Goswami
    Alexey Sorokin
    Ritesh Kumar
    Andrey Shcherbakov
    Edoardo M. Ponti
    Saliha Muradoğlu
    Lisa Beinborn
    Ryan Cotterell
    Ekaterina Vylomova
Sponsors
 
Interested in being a Sponsor? Contact us!


