Laboratory seminars

Seminar schedule 2023

14 March

George Moroz, Chiara Naccarato, Anastasia Yakovleva, Svetlana Zemicheva (HSE)

Non-standard features in spoken corpora of dialectal and regional varieties of Russian

Abstract

In this talk, we will present the preliminary results of ongoing research on variation in spoken corpora of dialectal and regional varieties of Russian. The collection of spoken corpora developed by members of the Linguistic Convergence Laboratory is constantly growing and currently includes 21 corpora of bilinguals’ and dialectal varieties of Russian. Based on data from such corpora, we investigate phenomena of variation at different linguistic levels. Some of the topics currently being investigated include preposition drop, case marking in numeral constructions, and functions of participles and converbs. After manual annotation of corpus data, we apply statistical methods to test linguistic and sociolinguistic hypotheses on the motivations of certain patterns of variation.

Svetlana Kuznetsova (HSE)

Reading group: A sampling technique for worldwide comparisons of language contact scenarios by Francesca Di Garbo and Ricardo Napoleão de Souza

Abstract

Existing sampling methods in language typology strive to control for areal biases in typological datasets as a means to avoid contact effects in the distribution of linguistic structure. However, none of these methods provide ways to directly compare contact scenarios from a typological perspective. This paper addresses this gap by introducing a sampling procedure for worldwide comparisons of language contact scenarios. The sampling unit consists of sets of three languages. The Focus Language is the language whose structures we examine in search for contact effects; the Neighbor Language is genealogically unrelated to the Focus Language, and counts as the potential source of contact influence on the Focus Language; the Benchmark Language is a relative of the Focus Language neither in contact with the Focus nor with the Neighbor language, and is used for disentangling contact effects from genealogical inheritance in the Focus Language. Through this design, we compiled a sample of 49 three-language sets (147 languages in total), which we present here. By switching the focus of typological sampling from individual languages to contact relations between languages, our method has the potential of uncovering patterns in the diffusion of language structures, and how they vary and change.

7 March

Alina Russkikh (HSE)

Constructions with collective numerals in typological perspective

Abstract

This study presents a typological investigation of constructions with collective numerals based on data collected from 105 languages. By constructions with collective numerals, I understand cases where a quantifying group indicates the selection of N items out of a set with N items in total, cf. English both, French tous les trois ‘all the three’ or Chuvash ik aʨ-i=de [two child-P_3=ADD] ‘both children’. The study aims to identify the possible strategies for forming such constructions in the languages of the world, and their distribution. In a number of languages, constructions with collective numerals are formed using morphological markers or syntactic models which occur in other grammatical functions as well. In such cases, I take into consideration those adjacent functions and their possible semantic relations with a collective meaning in constructions with numerals.

Aigul Zakirova

Number marking on verbs in the East Caucasian languages

Abstract

In this talk I describe different patterns of number marking on verbs, using mainly grammatical descriptions. Possible ways to analyze number marking on verbs are discussed: 1) as agreement, i.e. as an operation by which morphological features are copied from one word form onto another, 2) as a special category of verbal number. In addition, I dwell on the factors by which this marking is conditioned. As it turns out, the Nakh and the Avar-Ando-Tsez branches feature number marking strategies that are limited to a lexically defined set of verbs. Number markers in this case are localized in the verb stem. On the other hand, in many branches of the family, a different situation is also common: number marking is conditioned by the TAM-forms of the verb, i.e. by grammatical factors. Finally, the diachrony of number marking is considered. I propose that the verb forms which are marked for number often go back to copular constructions where the predicate position is occupied by a participial form. As a result of grammaticalization of these constructions into verbal forms, the original nominal number marking on the participle develops into verbal number marking.

21 February

Alexey Koshevoy (LPC, Aix-Marseille Université and Institut Jean Nicod, ENS-PSL), Anastasia Panova (Stockholm University) and Ilya Makarchuk (HSE University)

Building a Universal Dependencies Treebank for a Polysynthetic Language: the Case of Abaza

Abstract

In this talk we are going to discuss the challenges that we faced during the construction of a Universal Dependencies treebank for Abaza, a polysynthetic Northwest Caucasian language. We propose an alternative to the morpheme-level annotation of polysynthetic languages introduced in Park et al. (2021). Our approach aims at reducing the number of morphological features while still providing all the information necessary for a comprehensive representation of all the syntactic relations. In addition, we suggest adding one language-specific relation needed for annotating repetitions in spoken texts, and we present several solutions that aim at increasing the crosslinguistic comparability of our data.

Alina Russkikh (HSE), Maksim Melenchenko (HSE)

Reading group: C. T. Schütze (2011) Linguistic evidence and grammatical theory

Abstract

This article surveys the major kinds of empirical evidence used by linguists, with a particular focus on the relevance of the evidence to the goals of generative grammar. After a background section overviewing the objectives and assumptions of that framework, three broad kinds of data are considered in the three subsequent sections: corpus data, judgment data, and (other) experimental data. The perspective adopted is that all three have their place in the linguist’s toolbox: they have relative advantages and disadvantages that often complement one another, so converging evidence of more than one kind can reasonably be sought in many instances. Points are illustrated mainly with examples from syntax, but often can be easily translated to other levels (e.g., phonology, morphology, semantics, and pragmatics).

14 February

Philip Shushurin (ILS RAS / Ben Gurion University of the Negev, Beer Sheva)

Nouns, adjectives and other lexical categories

Abstract

Nouns and adjectives are often contrasted with other lexical categories as a category which can get case marking. Detailed typological studies reveal many additional similarities between nouns and adjectives such as the inability to have direct objects (cf. performance (of) the song, full *(of) problems). I propose a new analysis of the traditional syntactic categories N and Adj, suggesting that the principal properties of nouns and adjectives are largely determined by the presence of (valued or unvalued) inherent gender. Furthermore, I show how this system is able to extend to other lexical categories such as prepositions (and linkers) as well as verbs.

7 February

Anastasiya Ivanova (HSE)

Reading group: Hübler N. 2022 Phylogenetic signal and rate of evolutionary change in language structures. R. Soc. Open Sci. 9: 211252.

Abstract

Within linguistics, there is an ongoing debate about whether some language structures remain stable over time, which structures these are and whether they can be used to uncover the relationships between languages. However, there is no consensus on the definition of the term ‘stability’. I define ‘stability’ as a high phylogenetic signal and a low rate of change. I use metric D to measure the phylogenetic signal and Hidden Markov Model to calculate the evolutionary rate for 171 structural features coded for 12 Japonic, 2 Koreanic, 14 Mongolic, 11 Tungusic and 21 Turkic languages. To more deeply investigate the differences in evolutionary dynamics of structural features across areas of grammar, I divide the features into 4 language domains, 13 functional categories and 9 parts of speech. My results suggest that there is a correlation between the phylogenetic signal and evolutionary rate and that, overall, two-thirds of the features have a high phylogenetic signal and over a half of the features evolve at a slow rate. Specifically, argument marking (flagging and indexing), derivation and valency appear to be the most stable functional categories, pronouns and nouns the most stable parts of speech, and phonological and morphological levels the most stable language domains.

Johanna Nichols (HSE)

Work in progress on using typological distributions to identify language family homelands

Abstract

This is a preliminary report applying principles worked out in identifying homelands and centers of dispersal for Uralic and some of its branches (Grünthal et al. 2022) to reconstructing a center and trajectories for the Nakh-Daghestanian dispersal. Selection of typological variables, choice of methods of comparison, and types of coding can all selectively enhance the visibility of starting points or endpoints of dispersal trajectories. For Uralic we can reconstruct a broadly easterly homeland and an east-to-west distribution of the earliest branch ancestors, followed by northward spreads. Applying similar reasoning to Nakh-Daghestanian, there is also a directionality in early branch ancestor locations and diffusions of typological features, and there is geographical and archaeological evidence not available for Uralic. Compared to Uralic, a Nakh-Daghestanian origin appears to be less precise in time but more precise in space.

Grünthal, Riho, Volker Heyd, Sampsa Holopainen, Juha Janhunen, Olesya Khanina, Matti Miestamo, Johanna Nichols, Janne Saarikivi, and Kaius Sinnemäki. 2022. Drastic demographic events triggered the Uralic spread. Diachronica 39:4. 490–524. https://doi.org/10.1075/dia.20038.gru (open access)

31 January

Konstantin Filatov (HSE), Vladimir Plungian (MSU, IL RAS, RLI, HSE)

New Testament as a parallel corpus, and parallel corpus as a typological data base: a different look

Abstract

The use of parallel texts for typological research is a long-established practice, significantly enhanced with the advent of corpus technologies (informative surveys of the problem can be found in Cysouw & Wälchli 2007, Aijmer 2008, Frajzyngier & Mettouchi 2015, Doval & Sánchez Nieto 2019, among others). For a wide-scale cross-linguistic study, an especially efficient instrument is what is usually called a “massively parallel corpus”, which includes sample texts from a very large number of languages (up to several hundred; cf. Östling 2016, Нестеренко 2019). For that purpose, the best choice would be texts which are not only the most translated but which also target the less studied languages. Virtually the only candidate that satisfies both conditions is the New Testament: as of now, at least fragments of this text have been translated into more than 3,200 languages (including extinct, poorly documented or unwritten ones). Again, its importance for cross-linguistic studies is a well-known fact (the first attempts to use New Testament translations go back as early as the 18th century, as in Gottfried Hensel’s “Synopsis universae philologiae” with the earliest language maps known). In present-day typology, the story probably begins with Haspelmath 1997 and is continued by a number of fairly diverse approaches, for example Barentsen 2008, Wälchli & Cysouw 2012, or Dahl 2014. However, all existing attempts to use the New Testament follow roughly the same pattern: they (i) take one lexical or grammatical phenomenon of a presumably universal extent (such as verbs of motion or perception, spatial adpositions, markers of current relevance, etc.), then (ii) try to identify cross-linguistically reliable contexts for it and (iii) analyze the variation observed. Here, the focus is on one particular piece of data supposed to be identifiable in many doculects constituting the parallel corpus. What we would like to propose is somewhat different.
Our focus is not on a single category, but on the whole set of cross-linguistically relevant values belonging to the Universal Grammatical Inventory. Accordingly, the parallel corpus is viewed not as a mere collection of repeated occurrences, but as a complete database of grammatically relevant contexts where a corresponding grammatical value is most expected. The results of our preliminary research suggest that it is possible to isolate typical contexts for main tense and aspect values (as future and prospective, progressive, iterative and habitual, resultative and framepast), for main argument roles, for number and determination values, etc. Not all grammatical categories can be localized in this way, but only those with a clear semantic prototype, sometimes called “inherent” (cf. Frajzyngier & Mettouchi 2015). Nevertheless, the cases in point are numerous, and we are going to demonstrate some examples of how the relevant grammatical contexts can be profitably studied using the continuous annotation of New Testament parallel corpus.

Bibliography
Aijmer, K. 2008. Parallel and comparable corpora // A. Lüdeling & M. Kytö (eds.). Corpus Linguistics: An International Handbook. Berlin: De Gruyter Mouton, vol. 1, 275–291.
Barentsen, A. 2008. О конструкциях при глаголах восприятия в различных европейских языках (на основе переводов Нового завета) // E. de Haard, W. Honselaar & J. Stelleman (eds.). Literature and Beyond: Festschrift for Willem G. Weststeijn on the Occasion of his 65th Birthday. Amsterdam: Pegasus, vol. 1, 103–134.
Cysouw, M. & Wälchli, B. (eds.). 2007. Parallel Texts: Using Translational Equivalents in Linguistic Typology // Theme issue in Sprachtypologie & Universalienforschung, 60.2.
Dahl, Ö. 2014. The perfect map: Investigating the cross-linguistic distribution of TAME categories in a parallel corpus // B. Szmrecsanyi & B. Wälchli (eds.). Aggregating dialectology and typology: linguistic variation in text and speech, within and across languages. Berlin: De Gruyter, 268–289.
Doval, I. & Sánchez Nieto, M. T. (eds.). 2019. Parallel corpora for contrastive and translation studies: New resources and applications. Amsterdam: John Benjamins.
Frajzyngier, Z. & Mettouchi, A. 2015. Functional domains and cross-linguistic comparability // A. Mettouchi, M. Vanhove & D. Caubet (eds.). Corpus-based Studies of Lesser-described Languages: The CorpAfroAs corpus of spoken AfroAsiatic languages. Amsterdam: John Benjamins, 257–279.
Нестеренко, Л. В. 2019. Мультиязычные параллельные корпуса: новый источник данных для типологических исследований, перспективы использования и проблемы // Вопросы языкознания (2), 111–125.
Östling, R. 2016. Studying colexification through massively parallell corpora // P. Juvonen & M. Koptjevskaja-Tamm (eds.). The lexical typology of semantic shifts. Berlin: De Gruyter, 157–176.
Wälchli, B. & Cysouw, M. 2012. Lexical typology through similarity semantics: Toward a semantic map of motion verbs // Linguistics 50.3, 671–710.

K. Filatov

Reading group: Dahl, Ö. & B. Wälchli. 2016. Perfects and iamitives: two gram types in one grammatical space. Letras de Hoje 51(3). 325. https://doi.org/10.15448/1984-7726.2016.3.25454.

Abstract

This paper investigates the grammatical space of the two gram types – perfects and iamitives. Iamitives (from Latin iam ‘already’) overlap in their use with perfects but differ in that they can combine with stative predicates to express a state that holds at reference time. Iamitives differ from ‘already’ in having a higher frequency and showing a strong tendency to be grammaticalized with natural development predicates. We argue that iamitives can grammaticalize from expressions for ‘already’. In this study, we extract perfect grams and iamitive grams iteratively starting with two groups of seed grams from a parallel text corpus (the New Testament) in 1107 languages. We then construct a grammatical space of the union of 370 extracted grams by means of Multidimensional Scaling. This grammatical space of perfects and iamitives turns out to be a continuum without sharp boundaries anywhere.

24 January

Polina Nasledskova (HSE)

Ordinal numerals in sign languages

Abstract

This is a continuation of my typological research on the formation of ordinal numerals. Sign languages can provide valuable insights into typological generalizations. In this talk, I am going to compare the data from sign languages with my earlier observations based on languages from the WALS-100 sample. Although ordinal numerals in sign languages are in many respects similar to those of languages from WALS-100, some crucial differences between the two samples are also attested.

Daria Ryzhova (HSE)

Reading Group: Margetts, A., Haude, K., Himmelmann, N. P., Jung, D., Riesberg, S., Schnell, S., Seifart F., Sheppard H., Wegener, C. (2022). Cross-linguistic patterns in the lexicalisation of bring and take. Studies in Language. International Journal sponsored by the Foundation “Foundations of Language”, 46(4), 934-993.

Abstract

This study investigates the linguistic expression of bring and take events and more generally of the semantic domain of directed caused accompanied motion (‘directed CAM’) across a sample of eight languages of the Pacific and the Americas. Unlike English, the majority of languages in our sample do not lexicalise directed CAM events by simple verbs, but rather encode the defining meaning components – caused motion, accompaniment, and directedness – in morphosyntactically complex constructions. The study shows a high degree of crosslinguistic diversity, even among closely related languages. Meaning components are contributed to directed CAM expressions by a mix of lexical semantics, morphosyntax, and pragmatic means. The study proposes a text-based, semantic typology of directed CAM events by drawing on corpus data from endangered languages.

17 January

Elena Shvedova (HSE)

An outline for studying the system of verbal patterns in Urmi Neo-Aramaic

Abstract

There is some general idea of what Semitic verbal “patterns” (also called templates, binyanim, породы) are, not only among Semitists but also among linguists. The expected canonical verbal root in Semitic languages is purely consonantal, and inflected verb stems are created through a restricted number of derivational templates. These templates encode in a semi-systematic manner such dimensions of verb meaning as agency and voice. It is well described that verbal patterns can vary between languages: usually the variability of the number of the patterns, their individual form and function are mentioned. However, it is assumed that the template systems have more or less the same morphosyntactic status in all Semitic languages. The main task of my future research is to determine the status of verbal patterns in the system of Christian Urmi Neo-Aramaic (< Northeastern Neo-Aramaic < Semitic), a language with a significantly reduced, “non-classical” Semitic verbal system. On the one hand, three Urmian verbal patterns just mark the inflection-class membership of the verb. On the other hand, for some verbs the change of the pattern can be described as a morphological mechanism of changing valency. I will discuss the problems of establishing the consonantal root in Christian Urmi, the regularity of the patterns’ meanings and possible directions of my future work.

Kirill Chuprinko (HSE)

Reading Group: Schreur, J. W., Allassonnière-Tang, M., Bellamy, K., & Rochant, N. (2022). Predicting grammatical gender in Nakh languages: Three methods compared. Linguistic Typology at the Crossroads, 2(2), 93-126.

Seminar schedule 2022

20 December

Svetlana Zemicheva (HSE)

Tomsk dialect corpus as a comprehensively annotated resource

Abstract

The Tomsk dialect corpus (https://losl.tsu.ru/losl_search) is the biggest Russian dialect corpus with different types of annotation. It is based on recordings of Russian dialect speech made during dialectological expeditions along the Middle Ob River (Tomsk and Kemerovo regions, West Siberia) from 1946 to the present (more than 400 settlements of the region were surveyed). In this talk I will present the results of a three-year project devoted to creating the corpus. I will characterize its materials by touching on the issues of balance and representativeness and describe the corpus design in detail. The corpus as a comprehensively annotated resource includes 3 modules: 1) textual: annotation and search by a) extralinguistic parameters (year and place of the recording; informant’s sex, age, educational level) and b) text parameters (topic and genre); 2) grammatical: annotation and search by morphological parameters; 3) lexicographical: definitions of dialect lexemes. I will also present several case studies to demonstrate how this new electronic resource can be used in research practice.

13 December

Maria Brykina (The University of Hamburg), Josefina Budzisch (The University of Hamburg), Sergei V. Kovylin (Laboratory «Linguistic Platforms» Ivannikov Institute for System Programming of the RAS, Moscow; Tomsk State Pedagogical University)

A speaker-oriented study of dialectal features in Selkup

Abstract

Selkup is a Samoyedic (< Uralic) language known for numerous dialects and subdialects, and a lack of sharp boundaries between them (cf. Kazakevich 2022, Klump & Budzisch, forthc.). Our study aims at giving a new perspective on Selkup dialectal distribution. We use phonetic, morphological and lexical features to compare the speech of several dozen speakers on the basis of quantitative data acquired from several corpora. In our talk we show how such data help to improve our knowledge about individual features and present some preliminary results of speaker clustering, comparing them to existing dialectal classifications of Selkup. We will also discuss methodological problems that we had to face when dealing with missing values, heterogeneous features and a small amount of data for some speakers.

Konstantin Filatov (HSE)

Towards a working definition of “verbal grammatical system”

Abstract

This talk is part of my project on the diachrony of verbal grammatical systems in Andic languages (< Avaro-Ando-Tsezic < Nakh-Daghestanian). Before analyzing diachronic scenarios in the domain of verbal grammar, it would be wise to delimit the scope of relevant phenomena, i.e. to define what verbal systems are. However, in the existing literature on verbal categories (including Dahl’s well-known 1985 monograph with the word “system” in the title), little attention is paid to the core concept, and the term is most often just taken for granted. Combining the approaches that have been advanced by V. A. Plungian (1998, 2011) and V. S. Khrakovsky (1996), I believe that a verbal grammatical system can be profitably described considering at least 5 types of relations between meanings and forms:

(i) clustering of meanings: focusing on the internal structure of a polysemous marker
(ii) cumulation of meanings: focusing on cumulative / separated expression within grammatical markers
(iii) categorization patterns: focusing on how grammatical meanings are allocated to (same or different) categories
(iv) degree of morphologization: focusing on the morphological / periphrastic continuum
(v) morphotactic interactions: focusing on how a meaning can affect the expression of other grammatical meanings within the wordform

I will discuss each of these features in detail and exemplify them with facts from Andic languages.

6 December

Masha Kyuseva (Surrey Morphology Group)

Semantic factors in case loss: the Serbian-Bulgarian dialectal continuum

Abstract

Over time there has been a dramatic loss of rich case systems across languages of Europe. The analysis of historical texts has revealed the general picture about how this process occurred, yet the details of how it was implemented largely elude us. In particular, what happens to the case meanings when the morphological form falls out of use? Are they all expressed by an alternative form? Do they merge with the meanings of another case? Or is the unity of meanings that was supported by a common inflectional form completely dismantled? To answer these questions, we propose to look at data where this change is still taking place, namely within the South Slavonic dialect continuum formed by Serbian and Bulgarian. We focus on the decline of one case, the instrumental, in its non-prepositionally governed uses. Our analysis shows that the meanings of the instrumental are not covered by one alternative means of expression, but are split over a number of different prepositional constructions. The choice of prepositions is not random and is largely determined by functions of the original case form. This suggests that this case has no unified meaning (contra Jakobson 1936), and behaves more like a contingent cluster of functions.

Chiara Naccarato (HSE)

Reading Group: Poplack, S., & Levey, S. (2010). Contact-induced grammatical change: A cautionary tale. Language and space: An international handbook of linguistic variation, 1, 391-419.

29 November

A. Russkikh (HSE)

Functions of additive particle =lo in Zilo Andi (in the wake of the fieldtrip in September 2022)

Abstract

In this talk, I am going to discuss the results of my recent fieldtrip to the village of Zilo (Daghestan) aimed at researching the functions of the additive particle =lo in Zilo Andi (< Andic < Nakh-Daghestanian). Existing descriptions of other Upper Andi varieties (Verhees 2019, Maisak 2021), predominantly those of the villages of Andi, Gagatli and Rikvani, show that besides typologically common functions of additive particles (such as additive, scalar additive, concessive, coordination, topic marking, part of indefinite pronouns), the particle =lo can also be used in typologically less described contexts, such as constructions for collective numerals, converbal clauses, and as a part of the subordinating marker =lodːu and the comitative marker -loj. The goal of this talk is to describe the contexts in which =lo is attested in Zilo Andi and to understand the semantic contribution of =lo in its different functions. Special attention will be paid to the uses of =lo with universal quantifiers and different series of indefinite pronouns. I will also consider the semantics of =lo in combination with other components, especially with the emphatic (also called antiadditive) particle =gu.

Maisak, T. “Endoclitics in Andi”. Folia Linguistica, vol. 55, no. 1, 2021, pp. 1-34.
Verhees, S. General converbs in Andi // Studies in Language. International Journal sponsored by the Foundation “Foundations of Language”, 2019. 43 (1). p. 195

S. Zemicheva (HSE)

Reading Group: Larsson, Egbert, Biber (2022) On the status of statistical reporting versus linguistic description in corpus linguistics: a ten-year perspective

Abstract

This study investigates (i) whether there has been a shift towards increased statistical focus in corpus linguistic research articles, and, if so, (ii) whether this has had any repercussions for the attention paid to linguistic description. We investigate this through an analysis of the relative focus on statistical reporting versus linguistic description in the way the results are reported and discussed in research articles published in four major corpus linguistics journals in 2009 and 2019. The results display a marked change: in 2009, a clear majority of the articles exhibit a preference for linguistic description over statistical reporting; in 2019, the exact opposite is true. The number of different statistical techniques employed has also gone up. Whilst the increased statistical focus may reflect increased methodological sophistication, our results show that it has come at a cost: a diminished focus on linguistic description, evident, for example, through fewer text excerpts and linguistic examples, which appears to be symptomatic of increasing distance from the language that is the object of study. We discuss these shifts and suggest some ways of employing sophisticated statistical techniques without sacrificing a focus on language.

22 November

Chiara Naccarato, George Moroz, Konstantin Filatov, Asya Alekseeva, Anastasiya Ivanova, Maria Godunova, Maksim Melenchenko, Timofey Mukhin, Ilya Sadakov, Elena Shvedova (HSE)

TALD (Typological Atlas of the Languages of Daghestan): Update

Abstract

In this talk we will report on some of the recent updates made to the Typological Atlas of the Languages of Daghestan. We will discuss some of the chapters that were recently uploaded to the website, as well as new topics that are currently being developed. In the final part of the talk we will address some practical questions that are still being solved, and the future steps we intend to take.

15 November

Jesse Wichers Schreur (University of Groningen)

Contact-induced change in Tsova-Tush: a small typology of clause combining

Abstract

Many ‘small’ languages of the Caucasus, especially those in Daghestan, have been known to be relatively stable in the last centuries in terms of their numbers of speakers (Daniel, Chechuro, et al. 2021, p. 523). Other ‘small’ languages, especially those spoken in Georgia and Azerbaijan, have been characterised by heavy language contact or language shift (or sometimes both), such as Khinalug (Rind-Pawlowski, p.c.), Kryz (Authier 2010) and Udi (Gippert 2008). The Nakh language Tsova-Tush has been spoken in Georgian-dominated territory since time immemorial, and shows heavy lexical borrowing (Desheriev 1953; Wichers Schreur 2021). An often noted, but not thoroughly investigated aspect of Georgian linguistic influence is the restructuring of the Tsova-Tush system of clause combining, especially subordination. In this talk, the most common types of Tsova-Tush subordinate clauses will be presented, along with their comparison to Georgian on the one hand, and to other Nakh languages on the other. This will give us the opportunity to hypothesise about the origin of the various subordination strategies, especially taking into account historical Tsova-Tush data. Furthermore, work in progress will be presented on Tsova-Tush ‘cosubordination’ and a possible system of switch-reference marking.

Authier, G. (2010). “Azeri morphology in Kryz (East Caucasian)”. In: Turkic languages 14, pp. 14–42.
Daniel, M., I. Chechuro, et al. (2021). “Lingua francas as lexical donors: evidence from Daghestan”. In: Language 97 (3), pp. 520–560.
Desheriev, Y. D. [Дешериев] (1953). Bacbijskij jazyk. Fonetika, morfologija, sintaksis, leksika [The Batsbi language. Phonetics, morphology, syntax, lexicon]. Moscow, Leningrad: Akademija Nauk SSSR, Institut Jazykoznanija.
Gippert, J. (2008). “Endangered Caucasian languages in Georgia. Linguistic parameters of language endangerment”. In: Lessons from documented endangered languages. Ed. by K. D. Harrison, D. S. Rood, and A. Dwyer. Amsterdam: John Benjamins, pp. 159–194.
Wichers Schreur, J. (2021). “Nominal borrowings in Tsova-Tush (Nakh-Daghestanian, Georgia) and their gender assignment”. In: Language contact in the territory of the former Soviet Union. Ed. by D. Forker and L. A. Grenoble. Amsterdam: John Benjamins, pp. 15–33.

Konstantin Filatov (HSE)

Reading group: Kalinina, E., & Sumbatova, N. (2007). Clause structure and verbal forms in Nakh-Daghestanian languages. Finiteness: Theoretical and empirical foundations, 183-249.

Abstract

This chapter addresses finiteness in the languages of the Nakh-Daghestanian (East Caucasian) group. We argue that none of the approaches mentioned above yields satisfactory results when applied to the Daghestanian data. We claim that the important oppositions in the verbal system of the Nakh-Daghestanian languages are based on the illocutionary force and information structure of the sentences where the verbal forms occur, rather than on the dependent/independent distinction or the presence/absence of inflectional categories. Hence, the data of the Nakh-Daghestanian languages shed a new light on the definition of finiteness in terms of verb properties.

8 November

Yury Lander (HSE)

Describing narrow focus marking: a typological framework

Abstract

In this talk I discuss the diversity of constructions expressing narrow focus (also called “argument focus” by Lambrecht (1994) but including focalization of some adjuncts) and suggest a framework for describing the typology of (primarily monoclausal) narrow focus constructions. This framework, which was originally based on a scheme similar to Nichols’s (1986; 1992) “locus of marking” typology as modified in Lander & Nichols (2020), can also be considered a development of typological schemes proposed by Creissels (1978), Aannestad (2021) and possibly some others. At the time of writing this abstract, however, I also hope to spend some parts of the talk touching upon the problems with this approach and discussing whether it may become something more than a tool for description.
References:
Aannestad, A. A. (2021). A Typology of Morphological Argument Focus Marking. MA thesis. The University of North Dakota.
Creissels, D. (1978). Réflexions au sujet de l’article de Maurice Coyaud: “Emphase, nominalisations relatives”. La linguistique, 14(Fasc. 2), 117-141.
Lambrecht, K. (1994). Information structure and sentence form: Topic, focus, and the mental representations of discourse referents. Cambridge University Press.
Lander, Yu., & Nichols, J. (2020). Head/dependent marking. In M. Aronoff (ed.), Oxford Research Encyclopedia of Linguistics.
Nichols, J. (1986). Head-marking and dependent-marking grammar. Language, 56-119.
Nichols, J. (1992). Linguistic diversity in space and time. University of Chicago Press.

Anastasia Yakovleva (HSE)

Reading Group: Martti Leiwo (2020) L2 Greek in Roman Egypt: Intense language contact in Roman military forts

Abstract

This paper will focus on analysing user-related variation in Greek in Egypt as seen through potsherd letters (ostraka) of the residents of Roman forts, praesidia, in the Eastern Desert of Egypt. The letters can be dated to the first and second centuries CE. I suggest that the linguistic situation in the forts can be seen as evidence of extensive language contact that was connected with the considerable economic activity of the Roman Empire. All military forts had several L2 Greek speakers of various ethnicity. In what follows I will suggest that Roman soldiers and their civil partners had created a system that can be described as a feature pool of Greek variables. I suggest that the data from Egypt show that L2 speakers of Greek had an effect on Greek at all grammatical levels, strengthening existing and ongoing endogenous changes by creating substantial contact-induced variation in phonology as well as in morphosyntax and even phraseology. The intense language contact suggests, in my opinion, that language dynamics of this period follow the resilience theory, where various different phases of the adaptive cycle can be simultaneous, as almost all possible varieties of Greek, from historical High Attic to Multiethnic Greek, are in use.

25 October

Sara Zadykian (independent researcher), Polina Artemeva (independent researcher)

The Botlikh fieldtrip of August 2022: the materials collected and the conclusions made about the semantics of -ɬːu and -ɬi spatial markers

Abstract

In this talk we plan to give a brief overview of the collected materials, which include recordings of texts, an experiment and questionnaires; we also plan to present the conclusions made about the semantics of the -ɬːu and -ɬi spatial markers based on the collected data.

Daria Ryzhova (HSE)

Approaching (verbal) colexifications in Andic dictionaries

Abstract

Theoretical linguistics, especially cognitive semantics, has developed many theories about types of semantic relations between different meanings of one and the same word. Until recently, these approaches were based on rather limited data, mostly coming from the so-called SAE languages. With the emergence of a large number of digitalized dictionaries and wordlists (and of the CLICS database that aggregates them), huge amounts of data on colexifications in various languages became available. One of the topical tasks for lexical semantics now is to check whether all these data fit into the existing classifications of semantic shifts. In this talk, I will discuss my first attempt to classify colexifications found in Andic dictionaries according to semantic relationship within a pair of colexified meanings.

18 October

Konstantin Zaitsev (HSE), Anzhelika Minchenko (HSE)

Automatic Detection of Borrowings in Low-Resource Languages of the Caucasus: Andic Branch

Abstract

We would like to present to you a logistic regression model that automatically detects borrowings in Andic languages. We will describe how we improved our model’s quality using feature analysis and a language model approach. Finally, we would like to discuss our study results and future research.

George Moroz

Morphological transducers: from 0 to 40 000 forms

Abstract

In this talk I will cover some aspects of morphological transducers (created with lexd and twol tools) and their application to East Caucasian languages. Nick Howell (one of the developers of lexd) and I have collaborated on this project since 2020 and several transducers were created under our supervision. In this talk I will try to cover the pipeline and possible results of this work using examples from Rutul, Agul, Chamalal, Botlikh, and several dialects of Andi.

Tanya Kazakova (HSE)

Morphological transducer for Even

Abstract

In this talk I will discuss the creation of a morphological transducer for the Bystraja Even language and some problems that emerged during this process.

Daniil Ignatiev (HSE)

Exploring the limits of HFST tools

Abstract

HFST is a mature framework for natural language processing that has been around for many years. In this report, we describe how various HFST tools can be applied to Nakh-Dagestanian languages, and discuss the benefits and limitations imposed by this approach. The claims are illustrated by examples from Bagvalal.

11 October

Ilya Makarchuk

Towards a typology of “small” eventualities: on discontinuatives and verbal diminutives

Abstract

Languages of the world sometimes have verbal derivations for eventualities that are incomplete, deficient or in some other way lesser than the norm. Such derivations come in very different forms. This talk is about a subset of such derivations, which I call discontinuatives and verbal diminutives, exemplified by (1) and (2) respectively. At first glance they seem very similar, but, as I show in the talk, they behave differently. I will look at the behaviour of the derivations with different aspectual classes, show where these behaviours differ and propose an analysis of their semantics.
(1) vašʲa uj-a suxala-kala-r-ě
    Vasya field-DA plow-KALA-PFV-3SG
    ‘Vasya plowed (the same) field with interruptions.’ (Chuvash; Tatevosov 2006)
(2) Zan kontan sant-sante.
    Jean like.SF sing-sing.LF
    ‘Jean likes humming (different melodies).’ (Mauritian Creole; Henri, Winterstein 2014)

Polina Nasledskova

Reading group: Rice (2006) Ethical issues in linguistic fieldwork

Abstract

Ethical issues in linguistic fieldwork have received surprisingly little direct attention in recent years. This article reviews ethical models for fieldwork and outlines the responsibilities of linguists involved in fieldwork on endangered languages to individuals, communities, and knowledge systems, focusing on fieldwork in a North American context.

4 October

Yuri Koryakov

Jalgan-Mitagi Tat: sociolinguistics and affiliation

Abstract

In my talk, I would like to talk about the sociolinguistic situation in a small Tat community in the south of Dagestan which we visited this summer. Jalgan-Mitagi Tat turned out to be very much a living language, which is still spoken by some children. Nevertheless, language shift is in progress and now efforts can be made to stabilize the situation. I will also touch upon the place of JM Tat within the Tatic group, which includes closely related varieties originally spoken in North-Eastern Azerbaijan and Southern Dagestan. Finally, I will discuss an attempt to use a writing system for JM Tat, against the background of other projects for the alphabetization of Tat.

Timur Maisak

Reading group: Cysouw and Forker (2009)

Abstract

I suggest reading a paper by Cysouw and Forker (2009) on Tsezic languages: the authors look at the encoding of certain (nonspatial) functions by spatial cases in the modern languages and a) try to reconstruct the Proto-Tsezic encodings of these functions, b) examine whether it is possible to draw a genealogical tree based on the distribution of nonspatial uses of spatial cases. Additionally, I recommend another paper by Forker (2010) which you can look into if you find the data in the 2009 paper insufficient; the 2010 paper has more examples of the nonspatial uses of spatial cases. Our main paper for discussion will be Cysouw and Forker (2009). Both papers are attached.
Cysouw, Michael and Diana Forker. “Reconstruction of morphosyntactic function: Nonspatial usage of spatial case marking in Tsezic.” Language, vol. 85 no. 3, 2009, p. 588-617. https://doi.org/10.1353/lan.0.0147
Forker, Diana. “Nonlocal uses of local cases in the Tsezic languages.” Linguistics, vol. 48, no. 5, 2010, p. 1083-1109. https://doi.org/10.1515/ling.2010.035

20 September

Alina Russkikh

Typologically oriented questionnaire for describing additive functions

Abstract

This talk presents a typologically oriented questionnaire for describing additive functions. The typological study of additives (Forker 2016) shows that there is a list of common additive functions which are connected to each other semantically and can be represented on a semantic map. This questionnaire takes into account existing studies on additive particles, my own field experience of researching functions of additives in Turkic languages, and methodological aspects of elicitation. This particular questionnaire tests 14 additive functions: additive, scalar additive, standard concessive constructions, concessive conditionals, coordination, converbial clause, collective numerals, universal quantifiers, indefinite pronouns, mirative, contrastive topic, conjunctional adverb. In addition to a detailed discussion of additive functions and the reasons for choosing them for the questionnaire, I will consider ways to distinguish between functions that are semantically close to each other and possible problems that may arise when eliciting stimuli with additive particles.

Ivan Netkachev (HSE University)

Multifunctional additive particles in Rutul dialects: a microtypology

Abstract

In this talk, I discuss the functions of additive particles in 12 dialects of the Rutul language (< Lezgic < East Caucasian). I show that, although those dialects are generally mutually intelligible, there is significant variation with respect to the functions that the additive particles may perform. I discuss (i) their ability to conjoin NPs (“A and B”), (ii) their semantics (whether they can have scalar additive semantics or not), (iii) their ability to cooccur with other coordinating particles, (iv) their occurrence in various series of indefinite pronouns (specific vs. non-specific, free-choice) and (v) their occurrence in concessive and concessive-conditional clauses. Then I build up a microtypology based on those parameters, and sketch out the emerging theoretical generalisations.

13 September

T. Dedov, S. Verhees

A database for Arabic, Persian, and Turkic loanwords in Dagestanian languages

Abstract

In this talk we will present the DAG<APT database. The database currently contains lexemes from Dagestanian languages that have been established as borrowings from Arabic by Zabitov (2001). We intend to digitize more etymological sources in a similar manner. The goal is to create a comprehensive database of borrowings from major contact influences like Arabic, Persian, and Turkic languages into Dagestanian languages. In our talk we explain the design of the database and we discuss our plans and ideas for the future.

I. Sadakov

Update: Tsnal Lezgian Spoken Corpus

Abstract

This short talk will be an update on my work with the Tsnal Lezgian Spoken Corpus. After a brief introduction, I would like to discuss some features of the variety that I have noticed. Some of them might be distinctive within the Jark’i dialect of Lezgian, to which the Tsnal variety presumably belongs (Mejlanova 1964).

G. Moroz

Reading group: Barth, D., et al. (2021). Language vs individuals in cross-linguistic corpus typology. In G. Haig, S. Schnell & F. Seifart (eds), Doing corpus-based typology with spoken language data: State of the art. University of Hawai’i Press, Honolulu, pp. 179–232.

28 June

Polina Nasledskova & Tatiana Philippova

Postpositions in East Caucasian? An areal-typological study of a category development

Abstract

In our brief talk we will report on our progress in the areal-typological study of postpositions in East Caucasian languages. We will show you the three chapters and several maps that we have contributed to the Typological Atlas of the languages of Daghestan, highlighting the key results obtained. After that, we will present our theoretical ideas on how to analyze the emerging category of postpositions in East Caucasian.

Samira Verhees

Language vitality and attitudes in Botlikh (Dagestan)

Abstract

Botlikh, a minor unwritten language of Dagestan, is evaluated by UNESCO as “definitely endangered”, which means the language is no longer passed on to children. Like many Dagestanian languages, Botlikh is under pressure from Russian as the language of socio-economic mobility. Additionally, Botlikhs have been subsumed under Avars since the 1930s as part of linguistic and ethnic planning policies of the Soviet Union, and their language is still not officially recognized. As a result, there are no resources for the language besides two academic dictionaries that are not for sale to the public. Despite all these factors that we might expect to have a negative impact on language vitality, the language seemed rather alive to me during my trips to Botlikh. I observed children of different ages speaking Botlikh at home as well as to their peers. In the village of Miarso I even collected language data at the local school, where all of the children were proficient in the language. So I decided to conduct a survey among speakers of Botlikh to learn more about their language habits and how they view their own language: with whom do they speak it, do they find it important to pass it on to the next generation, and how do they see the future of the language. In the talk I will discuss the results of my survey and the method I used to collect data from Dagestan remotely.

14 June

Nikita Beklemishev (HSE University)

History of the spread of /f/ in southern Daghestan

Abstract

Among Nakh-Daghestanian languages, /f/ is found in the inventories of most Lezgic languages and Khinalugh. In early comparative works (e.g. Gigineyshvili 1977) the presence of /f/ was considered a Lezgic innovation, but, as it turns out, there might be an areal trace. I suggest that the sound was introduced to some languages through lexical borrowing, and to some others through inheritance. As a consequence, /f/ has different degrees of phonological “entrenchment”, and different patterns of distribution across the lexicon. Notably, all f-languages are located in southern Daghestan and have been under strong influence from Azerbaijan. My goal is to examine the phonemic status of /f/ in the languages of southern Daghestan, to survey how and when it might have appeared or been introduced, and to discuss how one single process could have applied to different languages in different ways. I will discuss several methods to evaluate the phonological entrenchment of /f/ and ways to determine the probable donor of /f/ as a borrowed phoneme, as well as the complex areal-genetic generalizations that the study yields.

7 June

Maksim Melenchenko

Dialectal variability and diachrony of numeral systems in East Caucasian languages

Abstract

According to the traditional overviews, languages of the East Caucasian family have decimal, vigesimal, or mixed numeral systems. Using data from grammars and dictionaries, I have tried to explore the variability of these numeral systems and their morphological features in detail, focusing on isoglosses between languages and their dialects. In the talk, I will present the results of this research and discuss several cases of dialectal variability of numeral systems and their possible implications for the diachrony of numeral systems in the family. The results call into question the consensus view that vigesimality is “native” to East Caucasian languages and that it existed in proto-East Caucasian.

31 May

Rita Popova & Michael Daniel (HSE University)

Size matters? Testing size effects in gender assignment in four East Caucasian languages

Abstract

In this study we test for referent size effects in nominal classifications of four East Caucasian languages. It was suggested by Kibrik (1977) that in Archi (Lezgic), assignment of nouns to Gender 3 and Gender 4 shows tendencies based on referent size. The idea has been echoed in Corbett (1991) and Fedden & Corbett (2018), who suggest, more specifically, that, in Archi, big entities are assigned to Gender 3. For Lak, in his discussion of the reconstruction of the “original” system of class assignment, Zhirkov (1955) proposes that historically Gender 3 was assigned to all animals, natural phenomena, round-shaped and large objects. To our knowledge, for other Lezgic languages that are genealogically related to Archi and show a similar four-way classification (i.e. have two inanimate genders in addition to feminine and masculine), no such effects have been reported, whether because they are weaker or absent altogether. The aim of this talk is to statistically test the hypothesis of referent size effects for Lak, Archi, Rutul and Tsakhur. What we want to see is whether Archi is indeed so different in this respect from Rutul and Tsakhur, its sister languages, and whether it is similar to Lak, its neighbour. We will first classify and refine the original hypotheses. We suggest that three different types of effects can in principle be expected, including absolute size effects observed in the lexicon at large (pace Corbett and Zhirkov), categorial size effects observed within specific conceptual categories (pace Kibrik) and, finally, referent size effects leading to flexible gender assignment (again, pace Kibrik; see also Di Garbo 2013, Di Garbo 2014, Di Garbo & Agbetsoamedo 2018), the latter functionally akin to diminutives and augmentatives (Grandi 2015). We then review cross-linguistic evidence of any of the three types. Next, we will discuss methods to detect such effects in a statistically meaningful way.
Unlike shape or conceptual categories, size is not based on a (nearly) categorical judgment, such as ‘is.human’ or ‘is.round’, but is a relative and scalar category based on judgments like ‘is.bigger’ or ‘is.smaller’. It is not immediately clear how to manually annotate referent size or establish thresholds for entities to be judged absolutely big or small. This may be the reason why size effects are rarely mentioned in overviews of East Caucasian nominal classifications (Xajdakov 1980, Ivanova 2019). We decided to run several experiments about size judgments, for which we recruited Russian speakers in the hope that size judgments will have at least some cross-linguistic validity. After running two different experiments (and also using data from McRae et al. (2005) and Binder et al. (2016)), we collected a small database of concepts that made it possible to check for different types of size effects, not only in the four languages in the analysis, but also, in principle, for any language. To test for correlations between being small and Gender 4 and being large and Gender 3, we mapped the concepts of the database onto nominal vocabularies of the four languages. We do not observe absolute size effects in gender assignment. We do observe categorial size effects in some but not other tested conceptual categories. Referent size as reflected in flexible gender assignment has not been tested experimentally; in Archi, it seems to be more lexically limited than in the systems discussed by Di Garbo in African languages, and requires further investigation.
References:
Binder, J. R., Conant, L. L., Humphries, C. J., Fernandino, L., Simons, S. B., Aguilar, M., & Desai, R. H. (2016). Toward a brain-based componential semantic representation. Cognitive neuropsychology, 33(3-4), 130–174.
Corbett, G. (1991). Gender. Cambridge: Cambridge University Press.
Di Garbo, F. (2013). Evaluative morphology and noun classification: A cross-linguistic study of Africa. Skase Journal of Theoretical Linguistics, 10(1), 114–136.
Di Garbo, F. (2014). Gender and its interaction with number and evaluative morphology: An intra- and intergenealogical typological survey of Africa (Doctoral dissertation). Department of Linguistics, Stockholm University.
Di Garbo, F., & Agbetsoamedo, Y. (2018). Non-canonical gender in African languages: A typological survey of interactions between gender and number, and between gender and evaluative morphology. In S. Fedden, J. Audring, & G. G. Corbett (Eds.), Non-canonical gender systems. Oxford University Press.
Fedden, S., & Corbett, G. G. (2018). Extreme classification. Cognitive Linguistics, 29(4).
Grandi, N. (2015). Edinburgh Handbook of Evaluative Morphology. Edinburgh University Press.
Ivanova, V. (2019). Korreljacija mezhdu imennym klassom i semantikoj i fonetikoj suschestvitel’nogo v nakhsko-dagestanskikh jazykakh [Correlation between noun class and the semantics and phonetics of the noun in the Nakh-Daghestanian languages].
Kibrik, A., Olovjannikova, I., & Samedov, D. (1977). Opyt strukturnogo opisanija archinskogo jazyka [Structural description of Archi] (Vol. 1). Izd-vo Moskovskogo universiteta.
McRae, K., Cree, G. S., Seidenberg, M. S., & McNorgan, C. (2005). Semantic feature production norms for a large set of living and nonliving things. Behavior research methods, 37(4), 547–559.
Xajdakov, S. M. (1980). Principy imennoj klassifikacii v dagestanskih yazykah. M.: Izdatel’stvo Nauka.
Zhirkov, L. (1955). Lakskij jazyk [Lak Language]. Fonetika i morfologija. M.: Izd-vo AN SSSR.

24 May

Gasangusen Sulaibanov (École Pratique des Hautes Études - PSL, Paris, France)

Complex verbs with ideophones in the Dargwa dialect of the village of Tsugni

Abstract

In the Tsugni dialect of Dargwa, our analysis identified more than a hundred lexical items that can be regarded as ideophones. These items are used with auxiliary verbs and form a subclass of coverbs. The number of verbs used in coverbal constructions with ideophones is limited to about ten, and only some of these verbs are used in a highly productive way. The talk will present a classification of the ideophones and the ways in which ideophones interact with different auxiliary verbs.

17 May

Timur Maisak

Morphological marking of “meditative” questions in Nakh-Daghestanian languages

Abstract

In the talk I will present the results of a pilot study of “meditative” questions, a special semantic type of non-canonical questions, which normally do not require an answer and can even be asked in the absence of an addressee (cf. ‘I wonder’-questions in English or ‘интересно’-questions in Russian). In a number of Nakh-Daghestanian languages, questions of this type have dedicated morphological marking (suffixes or enclitics), although there seems to be no systematic study of their marking types. I will look at the marking of meditative questions in comparison with the marking of ordinary (polar and content) and indirect questions in several languages of the family. I will also briefly discuss the typical contexts where meditative questions are found in texts.

26 April

Svetlana Amosova (Jewish Museum and Tolerance Center; Institute of Slavic Studies, RAS), Mikhail Vasilyev (Sefer Center; Institute of Linguistics, RAS)

The Jews of Daghestan: history, the current state of the ethnic group, and monuments of material heritage

Abstract

The first part of the talk deals with an ethnic group of Daghestan that has been, and still is, called by different names: Mountain Jews, Tats, Jews of Daghestan, Juhuri, and Kavkazi. We will explain what all these terms mean and where they come from, review different views on the ethnogenesis of the group, and discuss its territory of residence, the dialects of its language, and how its relations with other ethnic groups of the North Caucasus developed at different times. In addition, drawing on materials from expeditions of the last few years, we will show the features of the group's present-day identity and how it changed over the course of the 20th century. In the second part we will introduce the monuments of the material heritage of the Mountain Jews in Southern Daghestan, which are represented mainly by surviving synagogue buildings and by Mountain Jewish cemeteries of the 17th to 20th centuries. We will show how, when other written evidence is lacking, funerary epigraphy becomes one of the most important sources of information on the geography of settlement and the local history of small Jewish communities that lived in remote areas of Southern Daghestan and ceased to exist in the early 20th century. In conclusion, using the example of the Sefer Center expeditions conducted in 2018–2020, we will briefly describe the specifics and established practice of research on the traditional and contemporary culture and on the heritage of the Mountain Jews, both in the regions of traditional residence and in the diaspora.

19 April

Matthew Carter (University of California, San Diego)

Polyfunctional Argument Markers in Ket: Implicative Structure within the Word

Abstract

Ket is the last of the indigenous Yeniseian languages of central Siberia. Ket indexes both subjects and direct objects on the verb, but the way in which this is done varies significantly from one lexeme to another, forming a fairly complex system of inflectional classes (Nefedov & Vajda 2015). There is substantial reuse of material across different classes, such that the same marker may be the sole marker of the subject (1), or a co-exponent of the subject with another marker (2), or an object marker (3), depending on the verb. Furthermore, several argument markers represent a fusion or reanalysis of historically distinct markers, and alternatively or simultaneously encode completely orthogonal functions (marking tense, cf. 4 and 5), or serve no obvious function. This situation, wherein the same marker systematically encodes different functions across different lexemes, is known as Polyfunctionality (Stump 2015). It represents a type of complexity of exponence (Anderson 2015), a phenomenon wherein there is a non-isomorphic or otherwise opaque relationship between units of meaning (e.g. tense, person) and the formal units which are used to encode them (e.g. affixes, stem alternations). Polyfunctionality of the type seen in Ket would seem to present a communicative challenge in decoding: if the same marker can encode many different functions across different lexemes within the same subsystem of the morphology, on an arbitrary basis, how does a Ket listener understand which of the possible functions is intended in the given instance? This problem is made more acute by the fact that the language is pro-drop and exclusively head marking with regard to core syntactic arguments (Kotorova & Nefedov 2016). The potential communicative challenges presented by complex form~meaning mappings have been a major focus of much recent work in morphological complexity (Ackerman et al. 2009, Ackerman & Malouf 2015, Sims & Parker 2016).
Such work has largely focused on the question of how speakers of morphologically complex languages predict forms which they have never directly encountered (the so-called Paradigm Cell Filling Problem). If a language can encode the same information in many different ways (via affix allomorphy, stem changes etc.), how does a speaker know how to encode the information in any given form, provided that they have never encountered that form before? As a solution, such work implicates the property of inflectional paradigms known as implicative structure (Wurzel 1984). The morphology of a language exhibits implicative structure if known forms of a lexeme provide clues to unknown forms, such that all cells in a paradigm can be predicted from some subset of these cells. By hypothesis, form~meaning mappings in a language may be complex, provided that the necessary form can be predicted in any given instance (the Low Conditional Entropy Conjecture). However, this work has largely not focused on the role that implicative structure may play in decoding, as opposed to encoding, complex form~function mappings, nor on the role of implicative structure in syntagmatic, as opposed to paradigmatic, structure. Using data drawn from both published sources and original fieldwork, this paper demonstrates that in Ket, although individual argument markers are often highly polyfunctional, they are organized into networks of implicative relations which greatly reduce uncertainty with regard to their function in any particular instance. In other words, the range of possible functions for a particular argument marker can be greatly reduced by observing which other argument markers are present or absent in the same wordform, and which features those encode (cf. 6 and the dependency graph in 7). In this way, uncertainty with regard to the function of the wordform can be kept low, even without reference to the syntactic context or knowledge of paradigmatically related forms. 
As a case study, Ket is suggestive of a sort of “Low Conditional Entropy Conjecture in decoding”, wherein individual markers may be highly polyfunctional, provided that their functions can be determined in any given instance. The role of syntagmatic implicative structure in achieving this in Ket underscores the point made by Sims and Parker (2016) that the amount of work done by implicative structure is a point of cross-linguistic variation. It also makes predictions for other head-marking languages with very high complexity of exponence.
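The uncertainty-reduction idea in this abstract can be illustrated with a minimal sketch. The data below are invented toy observations, not Ket data, and the marker and function labels are hypothetical. The sketch compares the entropy of a marker's function on its own with its conditional entropy once the co-occurring marker in the same wordform is known.

```python
from collections import Counter
from math import log2

# Hypothetical toy observations (NOT real Ket data): each pair is
# (co-occurring marker in the same wordform, function of the target marker).
observations = [
    ("marker_A", "subject"),
    ("marker_A", "subject"),
    ("marker_B", "object"),
    ("marker_B", "object"),
    ("none", "subject"),
    ("none", "tense"),
]

def entropy(counts):
    """Shannon entropy in bits of a distribution given as raw counts."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values() if c)

def conditional_entropy(pairs):
    """H(Y | X) for (x, y) pairs: entropy of y within each x, weighted by P(x)."""
    by_context = {}
    for x, y in pairs:
        by_context.setdefault(x, Counter())[y] += 1
    total = len(pairs)
    return sum(sum(c.values()) / total * entropy(c) for c in by_context.values())

# Uncertainty about the function when the marker is seen in isolation.
h_function = entropy(Counter(y for _, y in observations))
# Remaining uncertainty once the co-occurring marker is known.
h_function_given_context = conditional_entropy(observations)

print(f"H(function) = {h_function:.3f} bits")
print(f"H(function | co-occurring marker) = {h_function_given_context:.3f} bits")
```

With these toy counts the conditional entropy (about 0.33 bits) is well below the unconditional entropy (about 1.46 bits), which is the kind of reduction the abstract attributes to syntagmatic implicative structure.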

12 April

Evgeniya Korovina (Institute of Linguistics, RAS)

Borrowings and contacts in basic vocabulary and classification

Abstract

Although the basic vocabulary by definition consists of words that are borrowed least often, borrowings in this part of the lexicon happen regularly. They range from highly visible loanwords from languages of other families, such as Spanish borrowings in the languages of the indigenous population of Latin America, to hard-to-find loanwords and structural parallelism (homoplasy) between languages within the same subgroup. Cases of the second kind are especially typical of the so-called dialect chains, where it is sometimes difficult to draw a line between idioms, as well as of situations of significant phonetic conservatism of languages. Using examples primarily from the history of the languages of Central America and Polynesia, in my talk I will try to consider ways to identify such situations with formal mathematical methods.

29 March

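Ramazan Abdulmazhidov (Institute of History, Archaeology and Ethnography, Daghestan Scientific Centre, RAS), Shakhban Khapizov (Institute of History, Archaeology and Ethnography, Daghestan Scientific Centre, RAS)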

A thousand years of writing in the languages of the peoples of Daghestan: a view through the prism of centuries

Abstract

Daghestan is a region of remarkable ethnic and cultural diversity, and the only one in the North Caucasus with a centuries-long history of writing. As early as the Soviet period, scholars established the existence of a unique script of Caucasian Albania, genetically related to the Armenian and Georgian scripts. This state, as is known, extended over most of present-day Daghestan. Real progress in the study of the Albanian script came with the discovery in the 1990s, in a monastery on the Sinai Peninsula, of two palimpsests, presumably from the 7th century, written in the “Aghwan” language. Only after they had been studied was it possible to establish conclusively the place of this language among the East Caucasian languages. The second attempt, chronologically, to record speech in East Caucasian languages in writing (in this case in Avar) is connected with the spread of Orthodox Christianity and Georgian writing in Daghestan. Missionary activity here was accompanied by the training of clergy from among the local population and by the composition of texts in Georgian. Professional research on the Georgian-script epigraphy of Daghestan began in the first half of the 20th century. The next stage in the development of writing in Daghestan was connected with its Islamization and the subsequent expansion of Muslim culture. The Arabic script began to be used for writing texts in East Caucasian languages as early as the medieval period, although this practice was hardly systematic. Among the recorded and surviving monuments, the earliest is a 14th-century Avar inscription on a stone set into the wall of the mosque in the village of Koroda, Gunib district, Republic of Daghestan. All these stages and processes in the development of writing in Daghestan will be covered in detail in the talk.

22 March

Timofey Dedov (HSE University)

Days after tomorrow in the languages of Daghestan

Abstract

In the languages of Daghestan, days after tomorrow can be encoded in different ways. Three types of strategies can be distinguished: 1) using semi-compositional terms with similar suffixes; 2) using transparently compositional constructions (as in most languages); and 3) using non-derived terms, which seems to be rare cross-linguistically. Some East Caucasian languages also differ from most other languages in the number of unique terms used to refer to the days after tomorrow (for the third strategy, the number of unique terms for consecutive days after tomorrow can be as high as six). In my talk, I will discuss all three strategies in more detail and present their geographical distribution, which was investigated for the “Typological Atlas of the Languages of Daghestan”.

Katerina Dagkou (University of Groningen)

Systems of grammatical cases in the languages of Daghestan

Abstract

The languages of Daghestan vary in terms of the system of core (grammatical) cases they feature. A first distinction concerns ergative vs. accusative languages. For ergative languages the typical system of core cases includes absolutive, ergative, genitive, and dative. Accusative languages include nominative, genitive, accusative, and dative. Some languages also include other grammatical cases besides the basic ones, e.g., affective, comitative, instrumental, comparative, and ablative. More exotic cases like contentive and benefactive are reported for a couple of languages. In this talk, I discuss the classification and distribution of systems of grammatical cases in the languages of Daghestan, which is the result of my research within the TALD (Typological Atlas of the Languages of Daghestan) project. Apart from classifying the languages according to the type and number of cases they include, I will also present the morphology of the grammatical cases per language group, their syntactic functions, and instances of case syncretism.

15 March

Katherine Hodgson (University of Cambridge)

Zok, the Armenian dialect of Agulis

Abstract

Zok (otherwise known as the Agulis dialect of Armenian) is a form of Armenian that is so divergent that it has been described by some as a separate language. It was spoken in and around the town of Agulis in the southern part of Nakhijevan. The first written record of Zok is from 1711, but there were Armenians living in this area from at least the 5th century AD. The Armenian presence in Agulis itself ended with the massacre of 1919, but dialect-speaking populations remained in some of the surrounding villages, notably Tsghna, Tanakert, Ramis, and Paraka, until the 1970s and 80s. Closely related dialects are still spoken today in a few villages just across the border in the area of Meghri in Armenia. However, with the exception of Karchevan (population 292), these villages are now virtually abandoned, and the language is not being passed on to the younger generation. It is the subject of a documentation project funded by the Endangered Languages Documentation Programme. Speakers from the villages Tsghna, Tanakert, Ramis, and Paraka in Nakhijevan, and Karchevan and Kuris in the area of Meghri have produced 27 hours of video, of which 4 hours have so far been annotated using ELAN and FLEx. Zok is not intelligible to speakers of other forms of Armenian, and various claims have been made about the origin of the speakers. However, a closer linguistic examination reveals that many of its distinctive features are shared wholly or partly with neighbouring Armenian dialects, especially those of Karabagh and northern Iran, implying that the Zoks are a local Armenian population with a long-term, stable presence in the area. The existence of geographically-correlated dialect variation between the villages where the language is spoken (the closer together they are, the more features they have in common) also suggests a stable pattern of settlement. 
Apart from phonological features (vowel shift, vowel harmony), the most striking distinctive characteristic is the development of a verb system that is unique within Armenian. This involves the loss of all monolectic verb forms in the indicative mood, and their replacement with participles + auxiliary, a tendency that exists in Eastern Armenian in general, but has nowhere else reached this extent. This is accompanied by the shift of tense marking from the auxiliary, which has become essentially a person marker, to a particle added to the ‘present’ (unmarked) form, something which is also found in Khoy/Urmia dialect. The past subjunctive is also formed in this way. Both these processes, as well as the mobility of the auxiliary/person marker, which attaches to the element with the main sentential stress, are characteristic of languages of the Iran-Araxes area.

1 March

George Starostin (Centre for Comparative Studies and Phylogenetics of the Institute for Oriental and Classical Studies, HSE / External Fellow, Santa Fe Institute)

The role of proto-wordlists in modern historical-comparative studies: from phonetic and semantic to “onomasiological” reconstruction

Abstract

Although lexicostatistical methods of estimating linguistic distance between related or potentially related language units have become an essential staple of modern day phylogenetic linguistics, their reliability often depends more on the accurate collection and curation of data than on the specific mathematical / computational methods applied to said data. In my talk, I shall try to delineate the theoretical and pragmatic importance of a relatively new methodology, dubbed “onomasiological reconstruction”, which purports to introduce a new level of accuracy to the projection of lexical items onto proto-levels of varying time depth. This methodology, which requires paying equal attention to phonological, semantic, and distributional features of compared items, can then be combined with lexicostatistical methods and applied with equal efficiency to varying datasets and linguistic taxa of widely varying time depth. In addition to having already yielded efficient results across genetic lineages ranging from Indo-European to North Caucasian to African language families, onomasiological reconstruction seems to hold plenty of potential for successfully differentiating between “patently false” and genuinely promising hypotheses of distant linguistic relationship.

22 February

Matthias Urban (University of Tübingen)

Typological patterns and the language dynamics of the ancient Central Andes and South America

Abstract

In this presentation, I will sketch different aspects of the language dynamics of the ancient Central Andes of Peru and Bolivia, one of the few “cradles of civilization” of humanity, and of South America more generally. I will highlight in particular the role of linguistic interaction and contact and the resulting typological distributions in understanding these dynamics. I will start out from the present-day linguistic landscape of the Central Andes, which is strongly dominated by the Quechuan and Aymaran families whose common contact-induced typological profile has for a long time influenced ideas of what Andean languages are like. I will then broaden the scope and explore how new analyses of the available materials for the now extinct languages of the Central Andes bring to light a now submerged interaction sphere in Northern Peru, and how this north-south structure is congruent with archaeological and molecular anthropological evidence, allowing for new ways of interdisciplinary dialogue beyond language expansions. Finally, I will broaden the scope again and show how recent work in the areal typology of broader parts of the Andes and South America articulates with these new findings. This work suggests a finely spatially structured gradient of typological variation in the Andes into which the new evidence from the Central Andes fits seamlessly. The proper interpretation of this gradient is not yet clear, though one possibility is that it reflects an ancient layer of affinities between the languages of the region.

1 February

Pavel Astafiev, Nikita Beklemishev, Nina Dobrushina & Alina Russkikh (in alphabetical order)

Looking for areal patterns in the domain of discourse formulae: The case of blessings and curses in Daghestan

Abstract

It is a well-known fact that certain discourse markers, such as interjections, formulae of greetings or leave-taking, vocatives or politeness markers are often borrowed (see Andersen 2014 for some references). The claim is primarily derived from the data on material borrowings, such as English OK or Russian davaj, but there is also scarce evidence of pattern borrowing in this domain. Studies mention the similarity of greetings in some areas (Matisoff 2011 in South-East Asia, Lüpke & Watson 2020 in West Africa) or good-night expressions in contacting languages (“May God wake us up” in Ewe and Likpe; Ameka 2006). There are few systematic studies of the spread of discourse patterns across certain areas, such as word iteration in the Mediterranean (Stolz 2004) or morning greetings in Daghestan (Naccarato & Verhees 2021). Areal comparison of some types of formulae can also be investigated in anthropology; for example, some formulae are included in the world-wide database of folklore and mythological motifs (Berezkin & Duvakin, http://www.ruthenia.ru/folklore/berezkin/). Many questions remain unanswered: are discourse formulae more diffusible than grammar? Do the areal distributions of formulae correspond to the areal distributions of linguistic features of other levels? How strong is the genealogical signal in the distribution of discourse formulae? What is their role in the transfer of various grammatical phenomena? In this talk, we will approach this issue from the perspective of wish-expressions, or blessings and curses in Daghestan. We will present the database of wish-expressions in nine languages of Daghestan. One of the problems with cross-linguistic comparison of blessings and curses is that it is not fully clear what the grounds for such comparison are, i.e. which wishes of one language should be mapped onto which wishes of other languages. 
We will discuss the problem of cross-linguistic comparison of wishes and our first attempts to process the sets of formulae in these nine languages in order to detect the areal signal.

References

Ameka, Felix K. 2006. Grammars in contact in the Volta Basin (West Africa): On contact induced grammatical change in Likpe. In Alexandra Y. Aikhenvald & R. M. W. Dixon (eds.), Grammars in contact: A crosslinguistic typology, 114–142. Oxford: Oxford University Press.
Andersen, Gisle. 2014. Pragmatic borrowing. Journal of Pragmatics 67. 17–33.
Lüpke, Friederike & Rachel Watson. 2020. Language contact in West Africa. In Evangelia Adamou & Yaron Matras (eds.), The Routledge handbook of language contact.
Matisoff, James A. 2011. Areal semantics - Is there such a thing? In A. Saxena (ed.), Himalayan languages: past and present (Vol. 149). Walter de Gruyter.
Naccarato, Chiara & Samira Verhees. 2021. Dobroe utro, prosnulis’? Utrennie privetstvija v jazykah Dagestana. In Timur A. Majsak, Nina R. Sumbatova & Jakov G. Testelec (eds.), Durh#asi hazna. Sbornik statej k 60-letiju R. O. Mutalova. Moskva: Buki Vedi.

25 January

Chiara Naccarato, Ezequiel Koile, Michael Daniel, Nina Dobrushina, Samira Verhees (Linguistic Convergence Laboratory); Aleksey Vinyar, Alexandra Nogina, Daria Ignatenko, Tatiana Kazakova, Alexey Baklanov, Ksenia Lapshina (Arctic Lab)

Detecting regional areal patterning across multiple linguistic features. A discussion

Abstract

Systematic study of areal convergence is fed by comparable data on typological profiles of languages belonging to the area under consideration. But when envisaging an analysis of areal patterning of languages within a certain area, an expert in this area runs a risk of skewing the results of her analysis by (unconsciously) selecting linguistic features known or more readily available to her, which may shape the area in a specific way, while choosing a different set of features for data collection may result in another partitioning of the same area. In this seminar, we are going to discuss this in connection with the ongoing projects of areal typological study of the languages of Russia. In the first part, we will provide a brief recap of the Typological Atlas of the Languages of Daghestan, a project in which data on typological diversity of Daghestanian languages and their relatives and neighbors is being systematized, surveying a general approach to data collection and examples of features. In the second part, our colleagues from the Arctic Lab will present similar data from their research on linguistic diversity of the languages of Northeast Siberia (from the Samoyedic branch of Uralic in the west to Turkic, Mongolian and Tungusic to Yukaghir, Nivkh, Chukotko-Kamchatkan and Aleut-Yupik-Inuit in the east). Finally, in a third part, we will discuss methodological issues and current approaches aiming to define linguistic areas with a global perspective. The seminar will be in a slightly unusual format, probably more interactive than usual and primarily intended to share experience and ideas, and with contributions from different research groups. We also invite comments from other participants who embark on comparable enterprises and encounter similar challenges.

18 January

Maria Khachaturyan (University of Helsinki)

Language contact between Mano and Kpelle: a holistic research program

Abstract

This talk presents an ongoing project on multilingualism and language contact between Mano and Kpelle, two Mande languages spoken in the South-East of Guinea. In the first part of the talk, I provide an overview of the project and its different strands, including 1) an investigation of the sociolinguistic situation of the region with a particular focus on strategies of language choice studied with ethnographic observations and with a sociolinguistic questionnaire (Khachaturyan and Konoshenko 2021); 2) a comparative study of the grammars of these two languages and their close linguistic relatives and identification of convergent and divergent features, including those potentially related to pattern (Konoshenko 2015: 176‑177) and matter borrowing (Khachaturyan 2019); 3) a study of translation, and especially religious translation, translatory artefacts and practice of translation, as a locus of contact and source of convergence (Khachaturyan 2020, Khachaturyan and Konoshenko in prep.). In the second part of the talk, I present an experimental study focusing on the acquisition of a particular morphosyntactic parameter of Mano, namely, reflexive marking, and the impact of the speakers’ exposure to Kpelle on the acquisition process. I present the experimental design, preliminary results and theoretical questions which the study aims to address.

11 January

Aigul Zakirova

Adjectival agreement in the East Caucasian languages: an overview

Abstract

Few sources deal with the origin of number agreement in the languages of the world. Apart from the theoretical work of Lehmann (1982), only a few case studies have been published, among them Frajzyngier (1997), Di Garbo (2020) and Cruz (2015). The East Caucasian (EC) languages are numerous, and adjectival number agreement in EC seems to be morphologically and diachronically heterogeneous, which suggests that it is innovative. This makes the EC languages suitable for investigating the origin of number agreement. However, no study of this kind has been undertaken yet. The goal of this study is to survey number agreement patterns in EC, to assess the weight of genealogical and areal factors in the distribution of patterns, and then to try to describe the paths by which adjectival plural agreement may have originated in the EC languages. I use the methodology adopted in the Typological Atlas of the Languages of Dagestan. So far, I have searched grammars of the EC languages and the neighboring languages (65 idioms overall) to find out how adjectival plural agreement is expressed in each of them. I divided the languages into three types: obligatory, optional, and absent plural agreement. For the optional type, I established the factors that influence the presence of number agreement and plotted these types on maps.

Seminar schedule 2021

21 December

Polina Nasledskova & Tatiana Philippova

Postpositions in East-Caucasian languages: a description and comparative perspective

Abstract

Our study summarizes and analyzes the information about postpositions provided in the various grammatical descriptions of East-Caucasian languages. In our talk, we are going to briefly report on our findings and discuss the prospects. First, we are going to propose an overview chapter on postpositions and several features for maps in the Typological atlas of the languages of Daghestan (TALD) that we are currently working on. Second, we are going to present our conception of a paper about East-Caucasian postpositions from a typological perspective. In particular, we are going to show that despite the fact that a number of properties of East-Caucasian postpositions differ significantly from the typical properties of adpositions in general and Indo-European prepositions in particular, the difference is not due to their special status, but due to the fact that they usually do not serve the functions of primary adpositions in other languages. Rather, East-Caucasian postpositions are more similar to the secondary adpositions in other languages, while typical primary adpositions more often correspond to the East-Caucasian spatial suffixes rather than to postpositions. Finally, we are going to suggest that the notions of localization and directionality, widely used for the description of spatial forms in East-Caucasian languages, can help better describe the meanings and functions of primary adpositions in other languages (e.g. Russian), if applied as comparative concepts.

14 December

Ezequiel Koile & George Moroz

Detecting linguistic variation with geographic sampling

Abstract

Geolectal variation is often present in settings where one language is spoken across a vast geographic area. It can be found in phonological, morphosyntactic, and lexical features. For practical reasons, it is not always possible to collect fieldwork data from every single location in order to capture the full pattern of variation, so a subset of locations must be selected for the survey in a way that approximates the underlying distribution of linguistic features. We propose and test a method for sampling locations where a language is spoken, finding the optimal places to include in a sample, with the goal of obtaining a distribution of typological features representative of the whole area. To this end, we use clustering algorithms such as k-means and hierarchical clustering to group locations by their geographic distribution, and define our sample of locations on the basis of this clustering. We test our methods against simulated data with different distributions of linguistic features on various spatial configurations, and also against real data from Circassian dialects (Northwest Caucasian). Our results show a higher efficiency than random sampling, both for detecting variation and for estimating its magnitude, which makes our method useful to fieldworkers when designing their research.
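
The sampling idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes planar coordinates, a plain k-means (Lloyd's algorithm) written from scratch, and the rule "survey the location closest to each cluster centre". The function names and the toy coordinates are hypothetical.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's k-means on 2D points given as (x, y) tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centres: k random locations
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # assign each location to its nearest centre
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # recompute each centre as the mean of its cluster
        # (an empty cluster keeps its previous centre)
        new_centroids = [
            tuple(sum(axis) / len(cl) for axis in zip(*cl)) if cl else centroids[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


def sample_locations(points, k, seed=0):
    """Pick one location per cluster: the one closest to the cluster centre."""
    centroids, clusters = kmeans(points, k, seed=seed)
    return [
        min(cl, key=lambda p: math.dist(p, c))
        for c, cl in zip(centroids, clusters)
        if cl
    ]


# Hypothetical village coordinates forming two geographic groups.
villages = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
sample = sample_locations(villages, k=2)
print(sample)
```

With real coordinates one would normally use a geodesic distance rather than the planar `math.dist`, and the number of clusters `k` would be set by the number of locations a fieldworker can afford to visit.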

7 December

A.Bonch-Osmolovskaya, E.Klyachko, S. Kosyak, L.Nesterenko, G.Moroz, O.Serikov, S.Toldova

Field NLP and where to find it in the School of linguistics

Abstract

In our talk we are going to present a new research area, which we call Field NLP. It is a mixture of several areas: applying natural language processing methods to low-resourced languages; creating tools for field linguists; obtaining low-resourced language data from sources other than fieldwork (digitisation, social media parsing, etc.); digital preservation of low-resourced language data; and popularisation of such data among speakers. Some of these domains are more developed and have more prominent results than others. We are going to highlight existing lacunae, give an overview of known tools for Field NLP, and present our research in this field. We will cover automatic transliteration, segmentation, speech recognition, morphological glossing and other tasks. We believe that the only way to advance Field NLP is to work as a community. Hence we aim to find common ground with all scholars engaged in the study and documentation of minor languages. With this in mind, we will specifically address the problem of digital preservation of linguistic data.

30 November

Dmitry Nikolaev (University of Stuttgart)

Studying language contact using neighbour graphs: From consonant-inventory prediction to analysis of segment borrowability

Abstract

The aim of this talk is to demonstrate the advantages of using geographical nearest-neighbour graphs for large-scale study of language contact. After discussing the motivation for using nearest-neighbour graphs in typological linguistics and briefly surveying the ways of constructing them, I will present two case studies. In the first one, I will show that a nearest-neighbour graph gives a flexible and efficient way of showing the importance of language contact for modelling the composition of segmental inventories of Eurasian languages; I will argue that at a certain time depth language contact becomes a better predictor of consonant-inventory structures than phylogenetics. In the second case study, I will show how SegBo, a recently presented dataset of borrowed phonemes, can be used to construct a world-wide graph of language contact and then will use this graph to model comparative borrowability of different phonemes.
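
As a rough illustration of what a geographical nearest-neighbour graph is, the sketch below links each language to its k closest languages by great-circle distance. This is only one possible construction and is not taken from the talk; the language labels, coordinates, and function names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def knn_graph(coords, k):
    """Map each language to the list of its k geographically nearest languages."""
    graph = {}
    for name, pos in coords.items():
        others = sorted(
            (other for other in coords if other != name),
            key=lambda other: haversine_km(pos, coords[other]),
        )
        graph[name] = others[:k]
    return graph


# Hypothetical languages with made-up coordinates (latitude, longitude).
coords = {"A": (0.0, 0.0), "B": (0.0, 1.0), "C": (0.0, 5.0), "D": (0.0, 6.0)}
graph = knn_graph(coords, k=1)
print(graph)
```

The resulting graph is directed: a language may be among another's nearest neighbours without the reverse holding, so any symmetrisation step is a separate modelling decision.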

23 November

Valentin Gusev (Institute of Linguistics, Russian Academy of Sciences)

Towards a reconstruction of ancient contacts among the languages of Northern Siberia

Abstract

The languages of Northern Siberia share a number of interesting areal features, some of them typologically non-trivial. The talk will examine some of these features and show which clusters the languages can be grouped into on their basis (there may be several such groupings, depending on which features are considered), and which of these groupings display unexpected geographic parallels. At least some of these parallels very likely point to ancient contacts.

16 November

Anastasia Panova

Chaplinsky and other Yupik languages of Chukotka: sociolinguistic situation and a case study in grammar

Abstract

In this talk I am going to present the results of my fieldtrip to Chukotka in October 2021. First, I will show a small corpus of narratives and songs in Yupik languages which I collected during my fieldwork. Second, I will present sociolinguistic data: following Dobrushina (2013), I used the method of retrospective family interviews and gathered some first-hand data on the language repertoires of Yupik people (namely, their knowledge of other Yupik languages, Chukchi, Russian and English), the history of Yupik-Chukchi relations and the history of relations between speakers of Chaplinsky (Central Siberian) Yupik in Novoje Chaplino and on St. Lawrence Island (USA) (cf. Morgounova 2007). Third, I will describe constructions with wordforms containing the suffixes -st ‘caus’, -sq ‘ask’, -nəχsiʁ ‘expect’, -niq ‘say’ in Chaplinsky Yupik. As has been previously noted for the reportative suffix -niq ‘say’ (Vakhtin 2007: 109-115), wordforms with ‑niq can be analyzed as consisting of two predicates: the matrix predicate ‘say’ and the dependent predicate. I develop this analysis and argue that constructions with all four listed suffixes represent examples of morphologically bound complementation (Maisak 2016, Panova 2020).

Dobrushina, N. (2013). How to study multilingualism of the past: Investigating traditional contact situations in Daghestan. Journal of Sociolinguistics, 17 (3). P. 376-393.
Maisak, T. A. (2016). Morphological fusion without syntactic fusion: The case of the “verificative” in Agul. Linguistics, 54 (4). P. 815–870.
Morgounova, D. (2007). Language, identities and ideologies of the past and present Chukotka. Études/Inuit/Studies, 31 (1-2). P. 183-200.
Panova, A. B. (2020). Morfologicheski svyazannaya komplementatsiya v abazinskom yazyke. Voprosy Jazykoznanija, 4. P. 87–114.
Vakhtin, N. B. (2007). Morfologiya glagol’nogo slovoizmeneniya v yupikskikh (eskimosskikh) yazykakh. S.-Petersburg: Nestor.

9 November

George Moroz

Comparing cross-language phonological profiles

Abstract

This talk considers different strategies for comparing the phonological profiles of languages. This can be useful for comparing different related lects (dialectology), unrelated lects (phonological typology), different diachronic states of the same lects (historical linguistics), models for language acquisition/loss, some NLP tasks, etc. I discuss two different strategies for comparing phonological profiles: the complexity-based approach and the distance-based approach. In the first approach, researchers propose different ways of calculating phonological complexity (Nichols 2009; Maddieson 2009; Coupé et al. 2009), which can be used in cross-language comparison (see criticism of this approach in Simpson 1999; Deutscher 2009; Ohala 2009). In the second approach, scholars apply different measures for calculating the distance between languages based on phonology (Heeringa 2004; Eden 2018; Anderson et al. 2021). There are two methods used in the distance measurement literature:

* parametric approach: different feature sets (segment inventory, feature inventory, typological phonological features like stress and syllable structure) are used for distance calculation;
* cross-entropy approach: entropy is used for the analysis of some samples of language data (corpus, dictionary).

Anderson, C., Tresoldi, T., Greenhill, S. J., Forkel, R., Gray, R. D., and List, J.-M. (2021). Measuring variation in phoneme inventories (preprint v1). Research Square.
Coupé, C., Marsico, E., and Pellegrino, F. (2009). Structural complexity of phonological systems. In Approaches to phonological complexity, pages 141–170. De Gruyter Mouton.
Deutscher, G. (2009). “Overall complexity”: a wild goose chase? In Language complexity as an evolving variable, pages 243–252. Oxford University Press.
Eden, S. E. (2018). Measuring phonological distance between languages. PhD thesis, University College London.
Heeringa, W. J. (2004). Measuring dialect pronunciation differences using Levenshtein distance. PhD thesis, University Library Groningen.
Maddieson, I. (2009). Calculating phonological complexity. In Approaches to phonological complexity, pages 83–110. De Gruyter Mouton.
Nichols, J. (2009). Linguistic complexity: a comprehensive definition and survey. In Language complexity as an evolving variable, pages 110–125. Oxford University Press.
Ohala, J. J. (2009). Languages’ sound inventories: the devil in the details. In Approaches to phonological complexity, pages 47–58. De Gruyter Mouton.
Simpson, A. P. (1999). Fundamental problems in comparative phonetics and phonology: does UPSID help to solve them. In Proceedings of the 14th international congress of phonetic sciences, volume 1, pages 349–352.
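
To make the parametric, distance-based approach from this abstract more concrete, here is a minimal sketch that compares two segment inventories with the Jaccard distance. The choice of Jaccard distance is an assumption made for illustration (the abstract does not name a specific measure), and the two inventories are invented toy data.

```python
def jaccard_distance(inventory_a, inventory_b):
    """1 - |A ∩ B| / |A ∪ B| for two collections of segments."""
    a, b = set(inventory_a), set(inventory_b)
    if not a and not b:
        return 0.0  # two empty inventories are treated as identical
    return 1 - len(a & b) / len(a | b)


# Invented toy inventories, not data from any real language.
lect_1 = {"p", "t", "k", "a", "i", "u"}
lect_2 = {"p", "t", "k", "q", "a", "i"}

distance = jaccard_distance(lect_1, lect_2)
print(round(distance, 3))
```

Here the two lects share five of seven distinct segments, so the distance is 2/7. A measure like this treats all segments as equally different from one another; feature-based variants weight segments by how many phonological features they share.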

26 October

Susanne Michaelis (Max Planck Institute for Evolutionary Anthropology)

Avoiding bias in comparative creole studies: Stratification by lexifier and substrate

Abstract

One major research question in creole studies has been whether the social/diachronic circumstances of the creolization processes are unique, and if so, whether this uniqueness of the evolution of creoles also leads to unique structural changes, which are reflected in a unique structural profile. Some creolists have claimed that the answer to both questions is indeed yes, e.g. Bickerton (1981), McWhorter (2001), and more recently Peter Bakker and Ayméric Daval-Markussen. But these authors have generally overlooked that cross-creole generalizations require representative sampling, especially when working quantitatively. Sampling for genealogical and areal control has been a much discussed topic within world-wide typology, but not yet in comparative creolistics. In all available comparative creole studies, European-based Atlantic creoles are strongly overrepresented, so that typical features of these languages are taken as “pan-creole” features, e.g. serial verbs, double-object constructions, or obligatory use of overt pronominal subjects. But many of these Atlantic creoles have the same genealogical/areal profile, i.e. European (lexifier) + Macro-Sudan (substrate). I therefore propose a new sampling method that controls for genealogical/areal relatedness of both the substrate and the lexifier, which I call “bi-clan” control (where “clan” is a cover term for linguistic families and convergence areas).

19 October

Natalya Stoynova

Assessing inter-speaker variation in contact-influenced Russian

Abstract

In this talk, I will deal with the Russian speech of older speakers of Nanai and Ulcha (Southern Tungusic, the Amur region). There is considerable inter-speaker variation: some bilingual Nanais and Ulchas speak a “near-pidgin” variety of Russian, while the speech of others does not differ greatly from the monolingual benchmark. The data used in the study come from the Corpus of contact-influenced Russian of Northern Siberia and the Russian Far East (http://web-corpora.net/ruscontact/corpus.html). This is a small spoken corpus with manual annotation of contact-induced grammatical features (non-standard agreement, non-standard argument encoding, etc.). Based on this annotation, I will try to assess the inter-speaker variation attested in the corpus. On the one hand, I will show (i) which contact-induced features appear to be more stable, i.e. equally represented in texts produced by different speakers, and which ones contribute most to inter-speaker variation; and (ii) which features behave similarly, i.e. are equally frequent or infrequent in texts produced by the same speakers. On the other hand, I will discuss (i) how speakers group together according to the contact-induced features typical of them; (ii) whether these clusters of speakers correlate with any sociolinguistic parameters; and (iii) whether they are in line with the researcher’s intuition or look surprising. An additional motivation for this study is methodological: I will test how precisely the existing corpus annotation captures the degree of deviation from the monolingual benchmark and inter-speaker variation.

5 October

Daniil Ignatiev, Nick Howell, George Moroz

Computational processing of Bagvalal morphology: problems and future tasks

Abstract

Bagvalal is a minority language of the Nakh-Daghestanian language family. Like many indigenous languages, Bagvalal lacks tools for computational processing of language data. While field researchers have accumulated a relatively large amount of linguistic data in documentation projects, it is still insufficient for statistical approaches to text processing to be applied. The talk discusses a rule-based technology for text processing that was successfully used to design a prototype morphological glosser for the Kwanada dialect of Bagvalal. Lack or insufficiency of certain types of lexical and grammatical data, to be discussed in the talk, complicates further tuning of the instrument as well as its application to other Bagvalal dialects. However, further work on the analyzer could facilitate fieldwork and make it possible to design a machine translation system for Bagvalal.

28 September

Maxim Melenchenko & Aigul Zakirova

Several aspects of numeral morphology in the languages of Dagestan

Abstract

In this talk we will demonstrate new maps for the Typological Atlas of the languages of Dagestan, covering several topics of numeral morphology in the East Caucasian languages. We examine numeral markers appearing in different series (cardinals, ordinals, distributives, etc) and elaborate on their diachronic sources. We also address differences in the structure of complex numerals, e.g. the inventories of linking suffixes and the repetition of cardinal markers inside complex numerals. Finally, we will discuss several instances of borrowing in numeral systems, including lexical and morphological borrowings.

21 September

Pierpaolo di Carlo, Jeff Good (University at Buffalo)

Exploring socio-spatial networks and individual-based variation in the study of small-scale multilingualism

Abstract

This talk presents the initial results of research by a number of members of the KPAAM-CAM multidisciplinary team (including linguists, sociolinguists, anthropologists, and geographers) aiming to explore multiple methods and datasets in the study of small-scale multilingualism. The testbed is Lower Fungom, a rural area of western Cameroon where small-scale multilingualism has been widely documented. In the talk, we will present (i) epistemological issues posed by contexts of small-scale multilingualism and the methodological responses we have put in place to address them, mainly concerning the need to explore individual-based variation; (ii) initial findings from the study of individual-based wordlists by applying tools originally designed for cognate detection for historical linguistic purposes to questions of synchronic variation, and (iii) the correlations that such lexicostatistical data have with geographic distance vs. travel difficulty between locales associated with distinct languages.

14 September

Ezequiel Koile, Ilya Chechuro, George Moroz, Michael Daniel

Geography and language divergence: the case of Andic languages

Abstract

We study the correlation between phylogenetic and geographic distances for the languages of the Andic branch of the East Caucasian (Nakh-Daghestanian) language family. For several alternative phylogenies, we find that geographic distances correlate with linguistic divergence. Notably, qualitative classifications show a better fit with the geography than cognacy-based phylogenies. We interpret this better fit as possibly due to an implicit geographic bias in qualitative classifications, and conclude that approaches to classification other than those based on cognacy run the risk of implicitly including geography and geography-related factors as one basis of genealogical classification.

7 September

George Moroz, Timofey Mukhin, Chiara Naccarato and Samira Verhees

Update: Typological Atlas of the Languages of Daghestan

Abstract

In this talk we introduce the recent updates made to the Typological Atlas of the Languages of Daghestan, which include new topics and new visualizations. We would also like to use this opportunity to discuss how to turn the atlas into a resource whose chapters and data are easy to use, cite and find on the one hand, and easy to edit and update on the other. During the talk we will also discuss a new phonological database of East Caucasian languages and the patterns that it reveals. We will discuss the distribution of the following phonological features: inventory size, gemination, labialisation, laterals, nasal vowels and long vowels, and briefly discuss the correlation between elevation and inventory size (apologies to those of you who have already seen this at the SLE conference).

15 June

Polina Nasledskova, Tatiana Philippova

Postpositions in Nakh-Daghestanian

Abstract

In this brief talk we will report on our ongoing project devoted to a general description of postpositional systems in the Nakh-Daghestanian languages. In particular, we shall talk about their case government properties and the ability to function as adverbs. At the end we will present ideas concerning our prospective contribution to the Typological Atlas of the languages of Daghestan.

George Moroz

Comparative Andic dictionary database: history of creation

Abstract

During the last two years, we worked together with Arseniy Averin, Anastasia Davidenko, Ilya Sadakov, Zlata Shkutko, Grigory Kuznetsov, Anna Tsysova and Wanshu Zhang on the digitisation of the Andic dictionaries. While compiling the database, we also worked on several subprojects on comparative phonology, colexicalisation and the morphology of plural noun forms. During the talk I would like to present the database and briefly discuss some preliminary results of this research.

8 June

Anastasia Panova

Towards a typology of continuative expressions

Abstract

This study investigates how continuative semantics is encoded cross-linguistically. The work is based on two independent language samples: a sample with global coverage and an intragenealogical sample of four Northwest Caucasian (Abkhaz-Adyge) languages. The cross-linguistic sample is genealogically and geographically balanced and includes 120 languages. Means of conveying continuative semantics (continuative expressions) are analyzed according to the following parameters: morphosyntactic type (affix, auxiliary, adverbial phrase), degree of grammaticalization, tense-aspect-actionality restrictions on the predicate, non-continuative uses of the continuative expressions, and semantic effects when combined with negation. The data come mainly from secondary sources (grammatical descriptions and dictionaries) and parallel texts. The second part of the study focuses on the intragenealogical typology of continuative expressions in the following Northwest Caucasian languages: Abaza, Abkhaz, Kabardian and West Circassian (Adyghe). The main sources for the study of continuative expressions in Northwest Caucasian are elicited data and parallel texts. Based on the results of the macro-typological and intragenealogical studies and their comparison, I suggest that two typological clusters or profiles of continuative expressions can be distinguished, predicative and adverbial, and that continuative expressions belonging to different classes show different degrees of diachronic stability.

1 June

Nikita Muravyev & Daria Zhornik

Solving the puzzle of the Ob-Ugric passive

Abstract

In this talk, we look at the active/passive voice alternation in two Ob-Ugric languages of Western Siberia, Northern Khanty and Northern Mansi. This alternation has been described in the literature as primarily motivated by information structure: a sentence appears in the active whenever an Agent is the primary topic of the sentence; otherwise the passive voice is used (Kulonen 1989, Nikolaeva 2001). However, recent text and elicitation data suggest that a purely information-structure-based approach has a number of shortcomings. First, the passive can be used if an Agent is topical yet low in animacy and/or definiteness. Second, focused Agents are allowed in special kinds of active sentences, e.g. in interrogative contexts. Moreover, passivization is possible with a great variety of intransitive verbs with no Agent role whatsoever, including state verbs and verbs denoting spontaneous change of state. Intransitive verbs can also be passivized in adversative contexts in which some discourse participant external to the event is affected in some way. These facts pose a problem both for the above-mentioned information-structural approach and for the existing typological accounts of the active/passive alternation. We will discuss these facts in detail, compare the situation in Khanty and Mansi, and present a model which helps to solve the Ob-Ugric puzzle at least partially.
References
Kulonen, U. M. 1989. The Passive in Ob-Ugrian. Helsinki: Finno-Ugrian Society.
Nikolaeva, I. 2001. Secondary topic as a relation in information structure. Linguistics 39(1), 1–50.

25 May

Ilya Chechuro & Michael Daniel

Looking for areal convergence in nominal gender assignment in East Caucasian

Abstract

In this talk, we investigate whether the data on nominal gender assignment in East Caucasian - more specifically, Lezgic - languages show any evidence for areal convergence. To do so, we consider those Lezgic languages and their immediate neighbours that feature four-gender systems, including Budukh, Kryz, Rutul, Tsakhur and Archi, and compare them to Lak, Archi’s immediate neighbour, and Khinalug, immediate neighbour of Kryz and Budukh. In all these languages, Gender 3 and Gender 4 are semantically heterogeneous, so shared assignment may be due to (a) common inheritance, (b) areal convergence, or (c) pure chance. A quantitative analysis of gender assignment across the lexicon documented in Kibrik and Kodzasov (1990) suggests that Archi is more similar to its neighbour Lak than to any of its Lezgic cousins. No such result has been obtained by comparing Khinalug and its Lezgic neighbours Budukh and Kryz. We will discuss various methodological refinements we attempted to unravel the genealogical and areal signals, and to distill both of them from the impact of crude semantics. These attempts were purposefully based on the use of data external to East Caucasian (World Loanwords Database; Wordnet) but so far have not been successful - so we will ask for your ideas to improve our methodology.

18 May

Jérémy Pasquereau (University of Poitiers)

On tense, aspect, and evidentiality in Karata (East Caucasian, Karata village variety)

Abstract

Like other East Caucasian languages, Karata has elaborate verbal paradigms, in particular because of the high number of analytic constructions it uses. On the basis of a 40+-text corpus and Dahl’s 1985 TAM questionnaire, and building on previous work (Magomedbekova 1971, 1998, Magomedova & Xalidova 2001, Xalidova 2019), I present ongoing work aiming at describing the morphosyntax and the meanings of verbal forms in this language.

11 May

Samira Verhees

Karabagly - an Armenian village in Dagestan

Abstract

In this talk I will report on my two-day visit to Karabagly, a village in northern Dagestan (Tarumovsky district) that was originally mono-ethnic Armenian and presently still has a majority Armenian population. I will discuss some preliminary observations on the preservation of Armenian language and culture in the village, and the relationship of the Armenians with other local people as well as their historical homeland Armenia.

Nina Dobrushina, Michael Daniel, Kirill Koncha, Maxim Melenchenko

Tsudakhar - Lak contact: evidence from sociolinguistic field study in April 2021

Abstract

In this talk, we will briefly present the results of the sociolinguistic field study carried out in five adjacent Lak and Tsudakhar villages. We will focus on the Tsudakhar - Lak bilingualism and ethnic contacts, their main site being the Tsudakhar Monday market. Our attempt to observe communication at the Tsudakhar market will be discussed, with a brief reference to other markets of highland Daghestan. We will also mention Tsudakhar - Avar contact in the village of Karekadani.

20 April

Susanne Maria Michaelis (MPI-EVA, Leipzig)

Grammatical co-expression patterns in creoles and their parent languages: comitative and related functions

Abstract

In this talk, I will report on an ongoing project on grammatical coexpression patterns (or polysemy patterns) in creole languages and their parent languages, as illustrated in examples (1)–(4). The Seychelles Creole polysemous marker (av)ek ‘with, and, by’ (< French avec ‘with’) is used to express four different grammatical functions: comitative (1), instrumental (2), passive agent (3), and noun phrase conjunction (4).

(1) comitative
Mon ‘n travay avek Sye Raim.
1SG PRF work com Mr Rahim
‘I have worked with Mr Rahim.’ (Bollée & Rosalie 1994:14f.)

(2) instrumental
Nou fer servolan nou file ek difil.
1pl make kite 1pl let.glide with thread
‘We made a kite and let it glide with a thread.’ (Michaelis 1994:66)

(3) passive agent
Mon’n ganny morde ek lisjen
1sg.prf pass bite pass.agent dog
‘I have been bitten by a dog.’ (Michaelis & Rosalie 2000:82)

(4) noun phrase conjunction
Mari ek Pyer
‘Mary and Peter’

When comparing the specific coexpression pattern of Seychelles Creole (av)ek with the patterns in its parent languages, it becomes clear that in French, the lexifier language, the marker avec ‘with’ only covers a subset of the meanings that the Seychelles Creole marker (av)ek covers, namely only comitative and instrumental. By contrast, the passive agent is expressed by par ‘by’ in French, and noun phrase conjunction is expressed by the coordination marker et ‘and’. However, Makhuwa and other neighboring Bantu languages of East Africa (the most important cluster of substrate languages relevant for Seychelles Creole) show the same coexpression pattern as the one cited for Seychelles Creole. Here, the marker ni (van der Wal 2009:113) covers all four grammatical meanings that we saw for Seychelles Creole: comitative, instrumental, passive agent, and noun phrase conjunction.

The hypothesis of the paper goes beyond Seychelles Creole: it extends to potentially all creole languages. I suggest that grammatical coexpression patterns in creoles are not randomly distributed, but systematically reflect the grammatical coexpression patterns of their substrate languages, and much less so those of their lexifier languages. Here I investigate 10 creole languages from around the world (genealogically maximally distinct) and their parent languages for the grammatical markers expressing comitative, instrumental, and noun phrase conjunction (and related meanings). Recent literature (e.g. Baptista 2020) suggests that “convergence” of functions (and possibly forms) of the parent languages is a major driving force for shaping creole grammars. Indeed, at first glance the coexpression pattern of a grammatical marker ‘with’ in a creole language seems to mirror overlapping, convergent grammatical meanings between its lexifier and its substrate language(s). But a closer look at the grammatical coexpression patterns of similar ‘with’-markers in genealogically different creoles and their parent languages reveals that it is the coexpression patterns of the substrates that tend to be imposed on the nascent creoles, irrespective of the degree of convergence of the lexifier patterns with those of the substrates and/or the creole. Thus, comitative, instrumental, passive agent and NP conjunction are shared by Makhuwa and Seychelles Creole, whereas French only converges in comitative and instrumental with both Makhuwa and Seychelles Creole.

References
Baptista, Marlyse. 2020. Competition, selection, and the role of congruence in creole genesis and development. Language 96:1, 160-99.
Bollée, Annegret and Rosalie, Marcel. 1994. Parol ek memwar. Récits de vie des Seychelles. Hamburg: Buske.
Michaelis, Susanne. 1994. Komplexe Syntax im Seychellen-Kreol: Verknüpfung von Sachverhaltsdarstellungen zwischen Mündlichkeit und Schriftlichkeit. Tübingen: Narr.
Michaelis, Susanne and Rosalie, Marcel. 2000. Polysémie et cartes sémantiques: Le relateur (av)ek en créole seychellois. Études Créoles 23. 79-100.
van der Wal, Jenneke. 2009. Word order and information structure in Makhuwa-Enahara. Utrecht: Netherlands Graduate School of Linguistics.

13 April

Aigul Zakirova

From noun plural to plural agreement: evidence from Andi dialects (and beyond)

Abstract

Noun plural markers sometimes grammaticalize into markers of plural agreement on various targets, e.g. Turkic -lar (Erdal 2004: 231 for Old Turkic, Matasović 2018 for Karaim); similar developments can be postulated for Adyghe -xe (Lander et al. forthc.) and Nivkh -ɣun (Gruzdeva forthc.). The process of grammaticalization of noun plural marking into plural agreement marking on other types of targets has not, to my knowledge, been dealt with in the typological literature. A way to fill this gap would be to describe scenarios of such evolution in particular languages and language groupings. Andi (Avar-Andic < Avar-Ando-Tsez < East Caucasian) presents an interesting case of grammaticalization of a plural marker -(V)l into a number agreement marker. I will address the question of how this mechanism of number agreement might have evolved. -(V)l is most probably the reflex of *li, one of the reconstructed Proto-Andic plural markers (Alexeyev 1988: 92-93). Whereas related Andic languages have an extensive list of plural markers that hardly have anything in common, in most Andi dialects -(V)l was generalized as a nominal plural marker. The next step was the extension of -(V)l onto other word forms, i.e. targets of agreement, both inside the NP and onto verbal forms and adverbs. The behavior of -(V)l on different types of targets will be considered in order to arrive at a plausible scenario. In a more descriptive vein, I will compare -(V)l-agreement to the more “canonical” gender agreement, also present in Andi. Finally, I will briefly consider examples of similar developments in related East Caucasian languages.

6 April

Thomas Wier (Free University of Tbilisi)

Squaring the circle in the Caucasus: Perspectives on Sprachbünde and Language Contact

Abstract

Linguists have long noted both the exceptional internal diversity of the Caucasus and the fact that many of the features of languages found there are not found in immediately adjacent regions of Eurasia. In the last two centuries, the question has thus arisen more than once: to what extent do these unusual features arise from language contact, and to what extent can they be explained by other (phylogenetic, typological, or indeed statistically random) traits? In this lecture I will review three different sets of answers that have been proposed: Klimov (1965, 1973); Tuite (1998); and Chirikba (2008). After reviewing these arguments, I will suggest that while autochthonous Caucasian languages do share a quantitatively large number of phonological and morphosyntactic traits, qualitative similarities are more probative in answering the question of whether the region constitutes a true Sprachbund, and a better approach might be to distinguish micro- and macro-Sprachbünde.

30 March

Oleg Belyaev

Contact influences on Ossetic: A general overview

Abstract

In many ways, Ossetic has a unique status among languages of the Caucasus. Belonging to the Iranian branch of the Indo-European language family, Ossetic is the last living representative of Sarmatian varieties once widely spoken in the northern Black Sea region. Having long developed in isolation from other Iranian languages, Ossetic has, on the one hand, preserved a number of archaic features; on the other hand, it has developed unique innovations, some of which may be explained by language contact. The Ossetic lexicon, mainly being of Iranian origin, has a comparatively large share of loanwords from neighbouring languages, many of them in the basic lexicon. In phonology, a key contact-induced feature is the presence of ejective consonants, mainly in Caucasian loanwords. Some grammatical features of Ossetic (word order, case system, structure of complex clauses) may also be contact-induced. Therefore, the data of Ossetic are valuable both for the typology of language contact and the study of early contacts of Ossetians / Alans and other ethnolinguistic groups. In the talk, I will provide a general overview and discussion of lexical and grammatical features of Ossetic that may be contact-induced, and a preliminary analysis of which contact situations could have led to these results.

23 March

Anastasia Panova & Michael Daniel

Linguistic complexity across East Caucasian: from the eye of the beholder to corpus based measures

Abstract

Measuring complexity in typology is deemed relevant for the sociolinguistic take on language diversity, connecting complexity of language structures to such diverse but correlated factors as language size, its relative isolation, its L2 acquisition and multilingualism of its speakers. Yet on the empirical side, measuring complexity is difficult not only because the measures are sometimes calibrated in what may seem an arbitrary way, but also - and certainly not less importantly - because they depend on the analysis in a grammar. As one example, Kibrik (1977) counts over a million synthetic verbal forms in Archi; but excluding verificative and especially quotative ‘series’ from inflectional morphology dramatically reduces this abundance. Similarly, measuring phonetic complexity based on the cardinality of inventories may deliver different stories depending on the approach; the status of [x] in Archi (only Russian loans) is very different from its status in Rutul (native lexicon); including or excluding rare allophones (Archi [ɮ]) and variants (Mehweb [ɣ]) that sometimes do and sometimes do not make their way into the descriptive inventories could in theory influence the outcomes of the quantitative comparison, and it is not obvious what the impact of these factors on the comparison might be. A way to avoid this would be (i) using shallow counts that minimize the analytical impact of language descriptions and (ii) making counts in corpora rather than deriving them from descriptive grammars. In this talk, after a brief survey of the existing corpus-based approaches to measuring language complexity, we discuss several experiments we carried out to measure morphological and phonetic complexity across unannotated corpora of the languages of Daghestan. We (dual, exclusive) are very much looking forward to feedback and suggestions as to how to develop this approach further.

16 March

Alexandra Vydrina

Multilingualism as a genre-structuring strategy: the case of Kakabe traditional narratives

Abstract

Various West African language communities show the use of a specific type of code-switching that is limited to the genre of traditional narratives: songs that appear in such narratives regularly include passages that are in a language different from the principal language of the narration. This type of conventionalized multilingualism is recurrently found across languages of West Africa. However, it has so far never been the object of any systematic investigation. In my presentation, I will analyze this type of multilingual practice based on data from 70 Kakabe traditional narratives, investigating the specific mechanism of switching from one language to the other and its relation to the wider context of the type of multilingualism found in this speech community.

9 March

Manuel Padilla-Moyano (University of the Basque Country & Linguistic Convergence Laboratory, HSE)

Revisiting motion events in Basque

Abstract

Asymmetries in spatial relations have been described cross-linguistically [Stefanowitsch & Rohde 2004; Luraghi, Nikitina & Zanchi 2017; Kopecka & Vuillermet 2021]. Basque has a set of spatial cases, in which the ablative encodes Source and Path, and the allative conveys Goal. Additionally, there are both directional and terminative case-markers. In some dialects, this general picture becomes more complicated, and historical records provide additional complexity, such as an ancient dedicated perlative marker [Lafon 1948]. As Basque spatial cases can mark animacy, asymmetries in the encoding of motion events must also consider this parameter [Creissels & Mounole 2011; Krajewska 2021]. I will present an incipient study on the Source-Goal asymmetry, which will be part of comprehensive research on the evolution of the Basque case-system. Following up on Zaika’s study [2016], I will analyze the behavior of verbs of motion, putting and posture, as well as the case-markers and non-grammaticalized postpositions they occur with. This work will consider dialectal variation, diachronic factors, and the role of language contact. To this end, I will exploit existing corpora and other materials, and collect new data from fieldwork with speakers of several dialects.

References
Creissels, Denis & Mounole, Céline (2011). Animacy and spatial cases: Typological tendencies, and the case of Basque. In Seppo Kittilä, Katja Västi & Jussi Ylikoski (Eds.), Case, Animacy and Semantic Roles (Typological Studies in Language 99), pp. 157–182. Amsterdam/Philadelphia: John Benjamins.
Kopecka, Anetta & Vuillermet, Marine (2021). Source-Goal (a)symmetries across languages. Studies in Language 45(1).
Krajewska, Dorota (2021). The marking of spatial relations on animate nouns in Basque: a diachronic quantitative corpus study [submitted to Journal of Historical Linguistics].
Lafon, René (1948). Sur les suffixes casuels -ti et -tik. Eusko Jakintza 2, 141–150.
Luraghi, Silvia; Nikitina, Tatiana & Zanchi, Chiara (Eds.) (2017). Space in Diachrony. Amsterdam/Philadelphia: John Benjamins.
Stefanowitsch, Anatol & Rohde, Ada (2004). The goal bias in the encoding of motion events.
Zaika, Natalia (2016). Вариативность падежных форм при глаголах движения в баскском языке в диахроническом и диалектном аспектах. Acta Linguistica Petropolitana 12(1), 428–441.

4 March

Natalia Kuznetsova

Rare features in phonological typology

Abstract

The talk will touch upon theoretical aspects of existing and emerging accounts of rare features in phonological typology in general, and in word-prosodic typology in particular. Rarities can be ignored by linguistic theory, reanalysed as regular, or incorporated by changing the theory. Phonological rara and rarissima used to be mostly ignored or reanalysed, but the trend seems to be changing, with ever more data coming in from lesser-studied languages, on the one hand, and a growing interest within linguistic typology in the geographic and evolutionary aspects of the cross-linguistic distribution of linguistic features, on the other.

Giuliano Castagna

Modern South Arabian: archaism, innovation and contact in the Arabian peninsula

Abstract

It has long been known that the Modern South Arabian subgroup of the Semitic language family, made up of six endangered languages spoken in Oman and Yemen, exhibits a set of characteristics regarded by Semitic scholars as archaic, such as large sound systems including lateral fricatives and affricates, and glottalised stops and affricates; productive subjunctive and conditional moods; and other characteristics that may be reminiscent of classical Semitic languages, such as the reverse gender agreement between numerals and nouns, and the presence of second and third person feminine and dual pronouns. However, certain other features of these languages have not been analysed in detail in the mainstream Semitic literature. In fact, some of these features have not been discussed at all: for example, the presence of a first person dual pronoun, and the apparently non-Semitic facies of a sizeable part of the Modern South Arabian lexis. Moreover, the unexplained relationship between Modern South Arabian languages and a huge number of undeciphered epigraphs found mostly in caves and on rocks and boulders calls for further study. These epigraphs employ a modified version of the south Semitic script, and are found not only in the present-day range of Modern South Arabian, but also further north-east into Oman proper. This presentation aims at providing a general introduction to the Modern South Arabian languages and highlighting the above-mentioned issues, as well as advancing some working hypotheses.

2 March

Tim Zingler

Form and function in morphological typology

Abstract

The goal of linguistic typology is to understand the interactions of form and function in the languages of the world. Typically, investigations conducted in this research paradigm take a certain functional domain (e.g., ‘causative/applicative’) as the starting point and subsequently analyze by which formal means it is expressed. In this talk, though, I will argue that typology can also benefit from following the opposite approach, that is, by focusing on a specific type of linguistic form (e.g., infixation) and analyzing which functions it encodes. The major advantage of the latter strategy is that linguistic forms are ultimately less variegated than linguistic functions, which facilitates comparison. This strategy will then allow typologists to develop a more nuanced theory of morphology and to account for areal patterns that manifest themselves in the distribution of linguistic forms. In order to support these claims, I will draw on novel research on the suffixing preference.

Riccardo Giomi

A Functional Discourse Grammar typology of reflexives, with some notes on reciprocals

Abstract

This chapter presents the first-ever Functional Discourse Grammar typology of reflexives and opens the way to a comparable typology of reciprocals. The main finding of the paper is that the striking morphosyntactic diversity of reflexive markers can be reduced to only three basic classes, which differ as regards the structure of the predication frame on which the construction is built. In Type I reflexives the lexical predicate takes two coindexed arguments; Type II reflexives are based on a one-place frame in which the predicate bears a reflexive (or reflexive/reciprocal) operator; finally, Type III reflexives are characterized by the presence of a configurational predicate which takes both an external and an internal argument. All further differences are explained with reference to different ways of aligning the underlying pragmatic and semantic structures of each construction-type – more specifically, the number and information-structural status of referents at the Interpersonal Level and the number and structural position of verb arguments at the Representational Level. A further advantage of the proposed typology is that of accounting for possible differences in the lexical distribution of reflexive markers on the basis of the notion of partially instantiated predication frames, i.e. partially lexicalized constructional templates of the Representational Level.

Julie Marsault

The prefixal template of Umóⁿhoⁿ: case study of the “dative” prefix

Abstract

Umóⁿhoⁿ (Siouan), a highly endangered Native American language spoken in the United States, possesses a highly complex verbal morphology, in particular a series of arbitrarily ordered derivational and inflectional prefixes. After a brief introduction to the language, I will present the verb’s prefixal template, then focus on the prefix gí-, usually called “dative”. The case study of gí- covers several key issues of Umóⁿhoⁿ morphology: (1) change of slot of person marking triggered by the presence of other prefixes; (2) multiple exponence of the dative and of person marking; (3) semantic demotivation and lexicalization of the prefixes. Building on these developments, I will show that the dative prefix exhibits both inflectional and derivational characteristics.

16 February

Timur Maisak

Towards the Nakh-Daghestanian Lexicon of Grammaticalization

Abstract

What will be discussed in the talk is not an accomplished or even an ongoing project, but rather a general idea of creating a lexicon of grammaticalization for Nakh-Daghestanian languages. I will start with an overview of existing lexicons of grammaticalization (which are very few) and how they can serve as a source of inspiration for the Nakh-Daghestanian Lexicon. I will then present example entries of the future Lexicon and mention the choices and problems one has to face when creating such a Lexicon. Comments and suggestions from the audience will be most welcome.

9 February

Tatiana Philippova

Adpositions and case: Categorial issues

Abstract

This talk will address the issue of the categorial status of case markers and adpositions from a cross-linguistic perspective. I will present some major research questions arising in this respect, including the following:
• How do we approach the case/adposition delineation problem in languages with case suffixes and postpositions?
• Is it feasible to posit a cross-linguistically uniform category of postpositions? If yes, is it principally distinct from that of prepositions?
• Can we meaningfully compare language-specific categories of case and adpositions across languages?
Having introduced these questions, I will give an overview of the state of the art in research on this and related topics. Please note that this will be an overview talk, rather than one showcasing the results of my own research. And I expect that there will be plenty of room for discussion!

2 February

Aleksandra Trepalenko, Timur Maisak

Towards the corpus of Bagwalal dialects

Abstract

Bagwalal is a small and underdescribed language of the Avar-Andic branch of the Nakh-Daghestanian family. After a general introduction to the language and the history of its research (Timur Maisak), we shall present the ongoing project on the glossing of Bagwalal dialectal texts (Aleksandra Trepalenko). The texts, first published in Gudava’s (1971) grammar in Georgian, represent all six villages where Bagwalal is spoken. We are going to present the results of our analysis of the texts (glossing, translation), mention the main dialectal differences and describe some interesting features of Bagwalal and the problems we faced during our work.

26 January

Sofia Oskolskaya (Institute for Linguistic Studies)

On typology of caritive constructions

Abstract

In the talk I will present the project “Grammatical periphery in the languages of the world: a typological study of caritives”. The caritive (aka abessive) expresses the non-involvement of a participant in a situation, with the non-involvement predication semantically modifying the situation or a participant of a different situation, as in English Mary came without John / money. The project aims at studying the means of expression of caritive meanings in the languages of the world. We developed a questionnaire and collected data from a representative sample of 100 languages. I am going to discuss the methodology of the project: the definition of the caritive, the questionnaire, and the methodology of data collection. The project is still in progress, but I will present some preliminary results.

19 January

Chingduang Yurayong (Mahidol University)

Postposed -to in North Russian dialects through the lens of Finnic languages in contact

Abstract

The use of demonstrative-derived morphemes in the head-following position is characteristic of North Russian dialects (-to and its variants -ta, -tu, -ti, -te, …) and eastern Finnic languages (-se [singular] and -ne [plural]), such as Olonets Karelian, Lude, and Veps. In terms of function, some previous studies regard these grammatical elements as definite articles, while other recent studies identify additional functions related to information structure and discourse. Given that an equivalent construction is not observed in Belarusian and Ukrainian, and that -to in other Russian dialects mostly remains invariable, several studies propose that declinable -to in North Russian could have resulted from language contact with the Uralic-speaking population who adopted Russian as their second language, particularly Finnic speakers. Of the many aspects of the research question, this presentation will focus on exploring contexts of use of -to in North Russian and -se in Finnic from the perspectives of referentiality, information structure, and evaluation. Potential development paths will also be discussed by paying attention to the Russian-Finnic contact scenario during the past millennium.

12 January

Timofey Mukhin, Chiara Naccarato, Samira Verhees

Dagatlas Update (Timofey Mukhin, Chiara Naccarato, Samira Verhees)

Abstract

We will give a short update about the project Typological Atlas of Daghestan covering the progress we have made so far. We will discuss our plans to publish the resource and a (partially) new approach to data visualization.

Polina Nasledskova

Borrowed postpositions in East Caucasian

Abstract

Grammar descriptions of East Caucasian languages include information about borrowed postpositions. I attempt to summarize the data on borrowed postpositions (both between branches of the East Caucasian family and into East Caucasian from languages of other families). I suggest contact origins for some postpositions whose diachrony is unclear from the sources. I will also provide an overview of the typology of borrowed postpositions. I ultimately aim at correlating the borrowing of postpositions in East Caucasian with other contact-induced changes in the languages of the family. This presentation is a preview of a study, not a final analysis of the data.

Seminar schedule 2020

22 December

Sergey Say

Using BivalTyp (www.bivaltyp.info) for measuring (dis)similarities between valency class systems (Sergey Say)

Abstract

The goal of my presentation is two-fold. In the first part, I am going to introduce BivalTyp (www.bivaltyp.info), a typological database of bivalent verbs and their encoding frames. This database contains information on the ways in which 130 bivalent contextualized predicates (such as ‘be afraid’, ‘listen’, ‘touch’) are assigned to valency classes in 85 languages (mostly spoken in Eurasia). This part of the presentation will be user-oriented, i.e., I will focus on the ways data are processed, stored and visualized in the database. In the second part, I will briefly discuss some of the ways in which BivalTyp enriches our knowledge of how arguments select encoding devices (such as cases, adpositions and verb indices) in individual languages. In particular, I will argue that the very partition of verbs into valency classes can be used as a justified tertium comparationis in cross-linguistic studies of argument encoding. I will also introduce some distance metrics that can be applied to the data from BivalTyp and will discuss genealogically and areally determined similarities between valency class systems in the languages of the sample.

15 December

Ksenia Shagal, University of Helsinki

Multifunctional non-finites in Northern Eurasia (Ksenia Shagal, University of Helsinki)

Abstract

In this talk, I am going to discuss patterns of multifunctionality that are characteristic of non-finite forms in 50 languages of Northern Eurasia. Specifically, non-finite forms are investigated in terms of the inventory of functions each of them can perform when heading a subordinate clause: reference function (complement clauses), adnominal modification (relative clauses), and adverbial modification (adverbial clauses). The primary questions I will address are the following: (a) What patterns of multifunctionality in non-finites are most common and how are they distributed geographically across Northern Eurasia? (b) Do patterns of multifunctionality differ depending on how prominent non-finite subordination is in a language? (c) Are there any recurrent patterns involving specific constructions, and if yes, can we propose an explanation for their occurrence?

8 December

Ilya Chechuro

Good Practices for Linguistic Data (Ilya Chechuro)

Abstract

This talk is devoted to the practices that make linguistic data findable, accessible, interoperable, and reusable (FAIR). First, I will introduce some general guidelines for data structures, file formats, and data description. Then I will touch upon the issues related to orthographic systems and discuss the problem of orthographic ambiguity. The first part will be concluded by the discussion of Cross-Linguistic Data Formats (CLDF) and meta-databases such as Glottolog, CLLD and Concepticon. Together these tools form a framework that attempts to facilitate data standardization and sustainable storage. The second part of the talk will deal with data sharing. I will propose several tools for increasing reproducibility of programming code. I will also discuss version control with Git and academic licences. Finally, I will briefly introduce the tools that are useful when submitting a paper: Open Science Framework (OSF.io) and Zenodo.

1 December

E.Rakhilina, T.Reznikova, D.Ryzhova

Lexical systems with systematic gaps: verbs of falling (E.Rakhilina, T.Reznikova, D.Ryzhova)

Abstract

The paper presents the results of a project on cross-linguistic analysis of FALLING verbs in more than 40 languages. The main possible oppositions and patterns of colexification in lexical systems are described in the framework of the Moscow Lexical Typology Group (Rakhilina, Reznikova 2016). Though in most languages this semantic field appears to be rich, our research did detect language systems without dedicated verbs of falling. We argue that these cases are neither accidental nor culture-specific, but can be seen as following from some fundamental semantic principles.

17 November

Ekaterina Kapustina

Features of the functioning of Daghestanian translocal communities in the context of internal migration within Russia (Ekaterina Kapustina)

Abstract

The talk analyzes the present-day structure and functioning of Daghestanian rural communities whose members take part in internal migration within Russia (migration to the cities of Western Siberia is taken as a case study). The theoretical lens is provided by the concepts of transnationalism and translocality, which make it possible to consider the migrant and the migrant’s social world without separating them from the sending community, the jamaat. The orientation towards preserving the priority of the village locality when moving beyond the village and the Republic of Daghestan, together with the maintenance of translocal ties, gives rise to a new social organism, the multilocal community. The talk will address the specific functioning of communities of this kind. The work is based on the author’s field material collected in the cities of the Khanty-Mansi Autonomous Okrug and in the Republic of Daghestan in 2011-2019.

10 November

Damian Blasi

An ancient history bottleneck for linguistic diversity and its consequences for linguistic typology (Damian Blasi)

Abstract

In this presentation I will discuss the relation between linguistic diversity and basic units of human organization in pre-agricultural, nomadic and forager societies. On the basis of those patterns I will discuss existing hypotheses on the previous stages of linguistic diversity (from early Holocene until today), and I will provide evidence for a relatively brief period of massive linguistic diversity from 4-1 kybp. I will conclude by spelling out the practical consequences of this finding for typological and historical linguistic generalizations.

3 November

Anna Azanova

Clitics li and chi in Rogovatoye and Spiridonova Buda dialects: functions and positional properties (Anna Azanova)

Abstract

In the dialects that I will be talking about, the repertoire of function words such as conjunctions and particles is considerably different from the standard variety of Russian. Thus, they have clitic chi that is considered to have the same functions as Russian li. So, my first research question is what is the distribution of the functions between these quasi-synonymous clitics in the dialects that have both of them? Secondly, I want to talk about their positional properties, depending on the function they have. There were many studies of Slavic clitics in the standard languages, but none (as far as I know) considered dialect data, and that’s what I’ve tried to do.

27 October

Nina Dobrushina

Optatives in Nakh-Daghestanian and beyond (Nina Dobrushina)

Abstract

Inflectional optatives - dedicated forms to express the speaker’s wish - are typical across the Caucasus. In this talk, I give an overview of optatives in Nakh-Daghestanian languages and discuss their possible diachronic sources and grammaticalization paths. I also argue for contact as one reason for the areal spread of optatives and suggest their prominent role in everyday discourse as a possible reason for this spread.

20 October

Chiara Naccarato

The standard of comparison in the languages of Daghestan (Chiara Naccarato)

Abstract

In this talk I will present the results of my research on the standard of comparison in the languages of Daghestan, which was started as part of the DagAtlas project (the “Typological Atlas of the Languages of Daghestan”). In the languages of Daghestan, the standard of comparison is usually expressed by a spatial form, i.e. an inflected form of a nominal normally expressing a spatial relation. In this study, I classify the languages of Daghestan according to the type of spatial form used to mark the standard of comparison. Following the methodological approach of the DagAtlas project, I collected the data from the available literature and built maps for the visualization of results. The results obtained are discussed both in terms of frequency and distribution within the linguistic area under investigation, and in comparison with broader typological investigations of comparative constructions (Stassen 1985, 2013), which include almost no reference to data from Daghestan. The latter comparison does not reveal surprising findings: the Daghestanian data adhere quite well to the cross-linguistic picture (with a general preference for elative markers). Within Daghestan, the overall picture seems a bit fuzzy, and the distribution of values on maps does not allow us to detect any noteworthy areal or genealogical clustering. An exception is constituted by Andic languages, which form a cluster based on the localization marker employed (forms in -č’- indicating contact with some entity).

13 October

Natalya Stoynova

Inter-speaker variation in code-switching in the situation of language shift. The case of Nanai and Ulch (Natalya Stoynova)

Abstract

In this talk I will present some quantitative data on different structural types of code-switching attested in oral texts in Nanai and Ulch (Southern Tungusic). These texts represent a specific mode of code-switching between Nanai/Ulch and Russian observed in the situation of language shift. Speakers were instructed by the linguist to tell something in their native language, and this was an unusual and artificial way of communication, since both languages are endangered and the dominant language of the speech community is Russian. All the texts contain a lot of Russian fragments of different sizes and morphosyntactic types. I will focus on inter-speaker variation. There is a general assumption that inter-speaker variation increases in the situation of language shift. My preliminary observation is that this, particularly, concerns structural types of code-switches attested in the texts under discussion. First, I will check this observation. Second, I will show that intensity and preferred structural types of code-switching correlate with general narrative habits and skills of a speaker.

6 October

Ilya Chechuro, Michael Daniel, Ezequiel Koile and George Moroz

Correlations between linguistic distances with geography in Daghestan (Ilya Chechuro, Michael Daniel, Ezequiel Koile and George Moroz)

Abstract

We continue our project of looking for correlations between linguistic distances and geography in Daghestan, an area of high language density and mountainous terrain. We are trying to detect the impact of landscape on linguistic divergence by comparing correlations of linguistic distances with Great Circle (“crow flight”) distances vs. distances calculated taking the terrain into account. This time we expanded our dataset to include Tsezic. We are trying to find ways to solve two problems: the geographic data being so much richer in datapoints than the documented village lects, and combining slightly different data (such as Swadesh lists vs. Jena lists) into a single count. We will tell you about our progress in the last few months in terms of data cleaning, playing with models and kicking each other. Still very much work in progress.

29 September

George Moroz

Phonetic fieldwork and experiments with the phonfieldwork package for R: rOpenSci review (George Moroz)

Abstract

There are many different tasks that typically have to be solved during phonetic research. They include creating slides that contain the stimuli, renaming and concatenating multiple sound files recorded during a session, automatic annotation in ‘Praat’ TextGrids (one of the sound annotation standards provided by ‘Praat’ software, see Boersma & Weenink 2018), creating an html table with annotations and spectrograms, and converting between multiple formats (‘Praat’ TextGrid, ‘EXMARaLDA’, ‘ELAN’, subtitles .srt, and .txt from Audacity). All of these tasks can be solved by combining different tools (relabeling is straightforward, Praat contains scripts for concatenating files, etc.). The R package phonfieldwork provides functionality that makes these tasks easy to solve without additional tools, and easier than with comparable packages (rPraat, textgRid). During the talk, I will show how the package works and what it can do, explain some changes that were proposed by rOpenSci reviewers, and welcome your ideas for improvement. The tutorial is available online.

22 September

Ilya Sadakov

A corpus of Tsnal Lezgian (Ilya Sadakov)

Abstract

Lezgian is a language of the Lezgic branch of the Nakh-Daghestanian language family. Lezgian dialects are subdivided into the Küre dialect group, the Axceh dialect group and the Quba dialect group. The object of my investigation is spoken in the village of Tsnal of the Khivsky district in the Republic of Dagestan and belongs to the Jark’i dialect of the Küre dialect group (Mejlanova 1964). In this talk I am going to present the Tsnal Spoken Corpus I am working on and to discuss some of my early findings on the Tsnal variety of Lezgian.

15 September

Timofey Mukhin

From spatial deixis to anaphora: data from Lezgic and Tsezic (Timofey Mukhin)

Abstract

Crosslinguistically, demonstratives, in addition to their primary, deictic function, often acquire anaphoric uses. East Caucasian languages have rich inventories of demonstrative pronouns involving no fewer than three different stems, but do not have dedicated 3rd person pronouns; instead, they use demonstratives. The main goal of my study is to examine which demonstratives are recruited in anaphoric function by obtaining corpus counts, separately for adnominal and independent uses. The study is based on narrative corpora of several languages from the Lezgic and Tsezic branches. I conclude that even closely related languages may show divergent behavior.

8 September

Alexandra Vydrina

Reflexive and the generic use of the second-person pronoun in Kakabe (Alexandra Vydrina)

Abstract

Kakabe (Mande) has a reflexive pronoun with an unusual restriction on its antecedent. It cannot appear with referential nouns and pronouns, where regular personal pronouns are used instead. It does appear, however, with generic and quantified subjects, as well as in infinitival clauses. It also appears in correlative clauses with a relativized subject. I provide an account of this unusual distribution, situate the Kakabe data in the broader typological context, and discuss a possible diachronic path from the second-person pronoun involved in the development of the unusual reflexive pronoun in Kakabe.

30 June

Nikita Muravyev

Verbal agreement and voice in the Uralic languages of Western Siberia (Nikita Muravyev)

Abstract

Uralic (Ob-Ugric, Samoyedic) languages located in Western Siberia mostly exhibit a pragmatically-driven verbal agreement system whereby argument indexing on the verb depends on topicality of the core arguments. As shown in (Nikolaeva 2001; Dalrymple & Nikolaeva 2011) for Obdorsk Khanty (Northern) and Tundra Nenets, languages tend to use a Subject agreement paradigm for the Topical A > Focal O setting and a special Subject-Object paradigm for Topical A > Topical O and sometimes also for Focal O > Topical O. Additionally, some languages use inflectional Passive (Inverse) forms for Focal O > Topical A. However, a deeper look into at least some languages of this area reveals that the usage of similar agreement and voice forms can depend not only on information structure but also on a number of other factors, such as referentiality, animacy, number, assertiveness, etc., partly resembling hierarchical indexation systems found in the Americas, South Asia and Australia, see e.g. (Zúñiga 2006). In the first part of the talk I will present my own field data from Kazym Khanty (Northern), which has a more intricate verbal agreement system based on topicality and definiteness, compared to the situation in the Obdorsk dialect. In the second part I will discuss an initial stage of comparative areal research done by our project team in an attempt to shed some light on this phenomenon and its underpinnings across Khanty dialects as well as in Mansi and Tundra Nenets, based on the text data available. Despite the very limited size and time depth of existing corpora and text collections and the still rather small amount of text material annotated and discussed by our team, the data show several tendencies that allow us to assess the overall situation in the region and to speculate about the possible diachronic evolution of agreement and voice systems in the languages under investigation. References: Dalrymple, M., and Nikolaeva, I., 2011. Objects and information structure. No. 131. Cambridge University Press. Nikolaeva, I., 2001. Secondary topic as a relation in information structure. In: Linguistics, 39.1: 1–50. Zúñiga, F., 2006. Deixis and alignment: Inverse systems in indigenous languages of the Americas. Vol. 70. John Benjamins Publishing.

23 June

Ezequiel Koile, Michael Daniel, George Moroz, Ilya Chechuro

Quantitative Linguistic Geography of Daghestan (Ezequiel Koile, Michael Daniel, George Moroz, Ilya Chechuro)

Abstract

We study how geographical factors shape the distribution of languages spoken in Daghestan. An interdisciplinary approach is developed, involving linguistic data, methods based on geographic information systems, and statistics. Using wordlists with the best available granularity, and geolocation data from the Atlas of Multilingualism in Dagestan, we build a geospatial inference model to explain the linguistic diversity of the area. Our project has two stages: (i) a synchronic mapping of the correlation between geographic and linguistic distances, and (ii) a diachronic reconstruction of the speakers’ dynamics, driven by phylogeny and contact events. In this talk, very much a work in progress and an opportunity to discuss the aim and the data, we will only cover the first stage, focusing on the area of North Daghestan where Andic languages are spoken.

16 June

Konstantin Filatov

Anchiq Karata indicative morphology. Allomorphy, inflectional classes and possible diachronic puzzles (Konstantin Filatov)

Abstract

In this talk I am going to present some results of my fieldwork in 2019: a fragment of description of Anchiq Karata (Andic, Avar-Andic-Tsezic, Nakh-Daghestanian) verbal morphology – the system of indicative verb forms. In the first part of my talk I am going to discuss Time-Aspect markers in three subparts of the paradigm, namely Perfective, Imperfective and Infinitive subsystems. The second part is dedicated to the procedure of establishing and explication of inflection classes. I am going to account for three major verbal inflectional classes (Conjugations) and a few smaller inflectional subclasses of morphophonological nature. Some morphological irregularities in verbal inflection are also going to be surveyed. The third part includes several diachronic questions that Anchiq Karata data on indicative morphology could elucidate in the context of divergence of Central-Andic languages.

9 June

Timofey Arkhangelskiy, University of Hamburg

Borrowings, frequency and lexical change (Timofey Arkhangelskiy, University of Hamburg)

Abstract

In this talk, I am going to explore the relations between borrowability, frequency of use and the dynamics of lexical change, based mostly on corpus data. The talk will have two parts. In the first part, I am going to look at frequency distributions of borrowings in the vocabularies of several languages. As I will show, a simple observation that more frequent words are less likely to be (recent) borrowings captures a facet of a much less trivial correlation. Specifically, the probability of a given word being a borrowing increases proportionally to the logarithm of its frequency rank in a sufficiently large corpus. The second part deals with the question of verbal borrowability. It has been long conjectured that verbs are more difficult to borrow than nouns. I am going to demonstrate that a recent empirical proof of that hypothesis by Tadmor et al. (2010) contains a logical fallacy because it implicitly equates borrowability and diachronic instability. I am going to provide an independent argument in favor of the hypothesis of greater diachronic stability of verbs compared to nouns. Nevertheless, it is not clear a priori whether this is a consequence of their lower borrowability or an independent phenomenon. In the latter case, it could alone cause the discrepancy between the observed proportions of borrowings among verbs and nouns, upon which Tadmor et al. based their argument.

2 June

Oleg Belyaev, Michael Daniel

Alternative recipient marking in Ossetic: Once-in-a-lifetime clearest case of contact induced change (Oleg Belyaev, Michael Daniel)

Abstract

Ossetic regularly allows marking recipients in ditransitive constructions using either dative or allative case, a kind of variation that closely corresponds to the distribution of dative vs. lative recipients in East Caucasian languages. We show that the semantic motivation for the choice of marking can be described in terms of transfer of ownership vs. spatial transfer; key evidence is provided by the distribution of the two strategies with instances of the verb ‘give’ containing and not containing spatial prefixes. As the phenomenon is not attested elsewhere in Iranian and seems to be extremely rare cross-linguistically, it is more than likely that this feature of Ossetic developed as a result of language contact with Nakh.

26 May

Aigul Zakirova

The emphatic particle =gu in Andi dialects (Aigul Zakirova)

Abstract

Andi =gu is an emphatic / intensifying enclitic. Beside contexts where it indicates some kind of contrast / emphasis, =gu is found in combination with many types of hosts where its contribution is less clear. With some hosts =gu is obligatory (e.g. cardinal numbers), with others it is optional. Similar enclitics have been observed for other Avaro-Andic and Tsezic languages, cf. Forker 2015 for Avar, Kibrik et al. 2001: 713 for Bagvalal. In this study I employ natural texts to answer the following question: what is the functional range of Andi =gu? The contexts identified in Andi will then be compared to those discussed in Forker 2015.

19 May

Aleksandrs Berdicevskis, The Swedish Language Bank, University of Gothenburg

Native speakers simplify their language when writing to non-natives on an internet forum (Aleksandrs Berdicevskis, The Swedish Language Bank, University of Gothenburg)

Abstract

It is often claimed that a large proportion of non-native speakers in a population facilitates morphological simplification (Trudgill 2011). There exists evidence in favour of this claim, but much is still unclear about the actual mechanism of simplification. Atkinson, Smith & Kirby (2018), relying on the evidence provided by artificial language-learning experiments, hypothesize that an important role is played by the interaction between speakers, primarily accommodation by more proficient speakers to less proficient ones. It is reasonable to expect that the most prominent case of accommodation would be foreigner-directed language (that is, accommodation by L1 to L2 speakers). I test this hypothesis using the resource described in (Berdicevskis 2018), a collection of large corpora of both L1 and L2 natural written production in four languages (English, French, Italian and Spanish), downloaded from WordReference forums. If the hypothesis about the simplicity of foreigner-directed language is correct, we can expect that L1 speakers would use simpler language when responding to messages posted by L2 speakers. I show that in most cases this is true. In the talk, I discuss to what extent these results support the accommodation hypothesis and, more broadly, general theories about the adaptation of languages to socio-cultural environments. References: Berdicevskis, A. (2018). Do non-native speakers create a pressure towards simplification? Corpus evidence. In Cuskley, C. et al. (Eds.): The Evolution of Language: Proceedings of the 12th International Conference, 41–43. doi:10.12775/3991-1.007 Atkinson, M., Kirby, S. & Smith, K. (2018). Adult learning and language simplification. Cognitive Science 42: 2818–2854. doi:10.1111/cogs.12686 Trudgill, P. (2011). Sociolinguistic typology: social determinants of linguistic complexity. Oxford: Oxford University Press.

12 May

Diana Forker

Elevation as a grammatical and semantic category of demonstratives (Diana Forker)

Abstract

In this talk, I study semantic and pragmatic properties of elevational demonstratives by means of a typological investigation of 50 languages with elevational demonstratives from all across the globe. The four basic verticality values expressed by elevational demonstratives are up, down, level, and across. They can be ordered along the elevational hierarchy (up > down > level/across), which reflects cross-linguistic tendencies in the expression of these values by demonstratives and is grounded in our cognitive representation of the vertical axis and the special position of the ‘vertical positive region’. Elevational values are frequently co-expressed with distance-based meanings of demonstratives, and it is almost always distal demonstratives that express elevation, whereas medial or proximal demonstratives can lack elevational distinctions. This means that elevational demonstratives largely refer to areas outside the peripersonal sphere in a similar way as simple distal demonstratives. In the proximal domain, fine-grained semantic distinctions such as those encoded by elevational demonstratives are superfluous since this domain is accessible to the interlocutors, who in the default case of a normal conversation are located in close proximity to each other. I then discuss metaphorical extensions of elevational demonstratives to non-spatial uses such as temporal and social deixis. There are a few languages in which elevational demonstratives with the meaning up express the temporal meaning future, whereas the down demonstratives encode past. This finding is particularly interesting in view of the widely-debated use of Mandarin Chinese spatial terms ‘up’ for past events and ‘down’ for future events, which show the opposite metaphorical extension.
I finally examine areal tendencies and potential correlations between elevational demonstratives and the geographical location of speech communities in mountainous areas such as the Himalayas, the Papuan Highlands and the Caucasus. I conclude that the data from elevational demonstratives do not support the Topographic Correspondence Hypothesis because languages spoken in similar topographic environments do not tend to have similar systems of elevational demonstratives if they belong to different language families.

28 April

Gilles Authier

Verbal morphological complexity in Lezgic languages (Gilles Authier)

Abstract

The Lezgic branch of East Caucasian, which comprises about twelve distinct languages, is very diversified typologically, in particular in the verbal morphology. Lezgic verbal systems grammaticalise variable sets of categories, and show different types and levels of complexity. Based on an overview of the attested morphological realisations of these verbal categories in all Lezgic languages, our presentation will endeavour to link probable cases of increased and decreased complexity in verbal systems (judging from what we can hypothesize about the Proto-Lezgic verbal system) with two main socio-linguistic considerations: the size of linguistic communities in diachrony and the influence of contact with non-Lezgic languages.

21 April

Daniel Wilson

Initial report on documentation of Sagada, Tsez. Language technology for endangered languages (Daniel Wilson)

Abstract

I am a research fellow at the University of the Free State in Bloemfontein, South Africa and am now working with the Department of Caucasian Languages at the Institute of Linguistics, Russian Academy of Sciences. I also work for XRI, a research institute which was designed to bridge the gap between academia and humanitarian development initiatives. This talk has two parts. First, I will be presenting an update on the status of my research on the Sagada dialect of Tsez, which I began last summer. The Tsezic languages are a sub-branch of the Nakh-Daghestanian (or East Caucasian) language family. The Tsezic languages are divided into two groups: the East Tsezic group (Bezhta and Hunzib) and the West Tsezic group (Tsez, Hinuq, Khwarshi, and Inkhowari). There is consensus in the prior research on Tsez that it should be divided into two main dialects: Tsez and Sagada, with further dialectal variation within Tsez (Imnaishvili 1963, Radjabov 1999, Abdulaev 2011, Comrie 2007, Polinsky 2015, etc.). The main division between Tsez and Sagada has been made based primarily on the variation noted by Imnaishvili in the middle of the 20th century. The data collected by Imnaishvili have provided most of the present knowledge about Sagada. It has even been noted that Sagada may rightfully be considered a distinct language (Maria Polinsky, Bernard Comrie p.c.). In this talk I will include sociolinguistic details from native speakers of Sagada, specifically how they view their language and its mutual intelligibility with the larger dialects of Tsez. I will also draw attention to phonological, morphological, and lexical similarities and differences between Sagada and Tsez. Second, I will give a brief presentation on the latest developments in language technology for low- and zero-resource languages. This is based on my work at XRI and my attendance at the recent conference at the UNESCO headquarters in Paris titled “Language Technology for All.” These latest developments are having a large impact around the world on the documentation and revitalization of endangered languages and could be useful for languages in the Caucasus.

14 April

Alexandra Vydrina

Sentence Focus in Kakabe (Alexandra Vydrina)

Abstract

This talk draws attention to the diversity of pragmatic functions of Sentence Focus (SF) utterances in natural speech, using the example of Kakabe, a Western Mande language. It is often ignored in the literature that SF can play multiple roles in discourse. Presentational ‘out-of-the-blue’ utterances answering the questions ‘What happened?’ or ‘What’s new?’ are often considered as their main or even their only type of use. Yet the analysis of natural texts shows that SF utterances are at least as frequently used with the so-called explicative function (Sasse 1987; 1996; Matras and Sasse 1995) and the even lesser known inferential function, studied by Declerck (1992), Delahunty (1995; 2001) and Bearth (1992; 1997; 1999b). In particular, I will highlight the intersubjectivity aspect of speech production that is crucial to understanding how inferential SF utterances are used. I will show with Kakabe data that when natural speech is considered, apart from introducing all-new events, SF utterances turn out to be associated with a rich array of discourse strategies, such as explicative, elaborative, disruptive functions, etc. Accordingly, the discourse properties of the referents inside SF are subject to variation, and crucially, they affect the implementation of the focus-marking.

7 April

Tatiana Philippova, Anastasia Panova

Preposition drop and language contact: The case of Daghestanian Russian (Tatiana Philippova, Anastasia Panova)

Abstract

This paper studies the phenomenon of preposition drop (cases where a preposition does not appear when we expect it to), in particular in locative, directional and temporal adverbial phrases. We review and classify the existing analyses of the phenomenon that were proposed for different languages, predominantly non-standard and contact varieties. Next, we proceed to our quantitative study of preposition drop in Russian spoken in Daghestan, based on data collected from the sociolinguistic interviews of the DagRus corpus. Employing statistical methods, we show how preposition drop depends on various linguistic and sociolinguistic factors. Level of Russian, preposition type and phonetic context turn out to be good predictors of preposition drop. We propose a functional explanation for the observed pattern.

31 March

Anastasia Yakovleva

Greek diglossia: a case study of spatial marking in Katharevousa (Anastasia Yakovleva)

Abstract

According to Ferguson (1959), in a diglossic situation two distinct varieties of a language (‘high’, learned through formal education, and ‘low’, colloquial) are spoken in the same community. He claims that the High variety always exists in a stable codified form, whereas the Low variety demonstrates wide variation in grammar and vocabulary. Although this is the case for some diglossic societies (such as Tamil), for others the situation is the opposite. My corpus analysis of Katharevousa (the official language of Greece till 1976) demonstrates the instability of this register in the domain of spatial relations.

24 March

Samira Verhees

What is a quotative evidential, and does it exist? (Samira Verhees)

Abstract

Reported speech markers constitute a substantial part of evidentiality’s semantic domain. At the same time, the internal division of this subdomain into specific values remains disputed. Aikhenvald (2004) proposed an important distinction of reportative and quotative markers: reportatives refer to information based on hearsay, while quotatives refer to information based on the verbal report of a particular source. Several authors have since argued that quotatives (in contrast with reportatives) are not proper evidentials (among them are Boye (2010) and Holvoet (2018)), because they designate a proposition, rather than specify the speaker’s information source. In this talk I will discuss the typology of reported speech evidentials and compare the properties of “quotative” markers from a variety of languages to determine whether they can be viewed as evidentials.

17 March

Polina Nasledskova

Denominal postposition in East Caucasian languages (Polina Nasledskova)

Abstract

This is a study of grammaticalization sources of postpositions across East Caucasian languages. The focus is on the postpositions grammaticalized from nouns denoting body parts. While these nouns cross-linguistically often grammaticalize into spatial markers, in particular adpositions, this path does not seem to be typical of (all) East Caucasian. Postpositions from body parts are not equally spread across the family. Some languages have many, some few, and some none at all. The goal of this study is to provide an account for their distribution in genealogical and areal terms.

Timofey Mukhin

Anaphora and spatial deixis in East Caucasian: an overview of the data

Abstract

Demonstratives, in addition to their main deictic uses, cross-linguistically often acquire a number of other functions, including anaphora. The majority of East Caucasian languages have rich inventories of demonstrative pronouns that employ three or more different stems covering various dimensions of deixis (distance, altitude and others). Most languages of Dagestan do not have a special 3rd person nominal pronoun and use attributive demonstratives instead. The main goal of my study is to examine what governs the choice of the stem to be used anaphorically (for example, only the proximal one) or, if all stems of an inventory are used in this function, how they are distributed in this function by frequency. My study is based on grammars and samples of narrative texts. In this presentation, only data from Lezgic languages will be discussed.

10 March

Peter Arkadiev

Non-canonical inverse in Circassian and Abaza: borrowing of morphological complexity (Peter Arkadiev)

Abstract

In this paper I discuss a typologically peculiar inverse-like construction found in the polysynthetic ergative Circassian languages of the Northwest Caucasian family and argue that this construction has been borrowed into Abaza, which belongs to a different branch of the same family. These languages possess a cislocative verbal prefix, which, in addition to marking the spatial meaning of speaker-orientation, systematically occurs in polyvalent verbs when the object outranks the subject on the person hierarchy. The inverse-like use of the cislocative in Circassian differs from the “canonical” direct-inverse system in that, first, it is fully redundant since the person-role linking is achieved by means of the person markers themselves and, second, it does not occur in the basic transitive construction, featuring instead in configurations involving an indirect object both in ditransitive and bivalent intransitive verbs. I argue that the similar use of the cislocative prefix observed in Abaza is a result of pattern-borrowing from Kabardian, with which Abaza has been in intense contact, and that this borrowing has resulted in the increase of both paradigmatic and syntagmatic complexity of Abaza verbal morphology.

3 March

Evgeniya Budennaya

Non-pro-drop in the Baltic Area: for and against contact-induced origin (Evgeniya Budennaya)

Abstract

Five geographically close languages to the east of the Baltic Sea – Russian (East Slavic), Latvian (Baltic), Ingrian, Votic and Ingrian Finnish (Finnic) – use a similar pattern for marking subject reference. In this pattern both personal pronouns and subject agreement on the verb are employed (from ⅔ to ¾ of all occurrences). This happens in all persons:

(1) Russian:
Ja id-u domoj
I.NOM go.PRS-1SG home
‘I go home’

(2) Latvian:
par k-o t-u domā-ø?
about what-ACC 2SG-NOM think.PRS-2SG
‘What are you thinking about?’

(3) Ingrian:
hǟ kūl-i-ø
3SG.NOM die-PST-3SG
‘She is dead’

However, this double-marking pattern is extremely uncommon across the world, where most languages are either pro-drop with verbal inflection (61%, WALS) or non-pro-drop without any additional verbal inflection (Siewierska 2004). Taking together the geographical proximity of the languages under discussion and the typological rarity of the referential pattern itself, one can treat it as an areal feature which could not arise independently (Kibrik 2013). The talk will trace this feature diachronically and discuss the results for Russian, Latvian and minor Finnic. Special attention will be given to the controversy of whether we deal with a similar contact-induced change in Latvian and minor Finnic or with two different processes that eventually converged into an apparently similar pattern.

25 February

Chiara Naccarato, Samira Verhees, Timofey Mukhin, Rita Popova, Lev Kazakevich, Konstantin Filatov

Typological atlas of Daghestan: state of affairs and future plans (Chiara Naccarato, Samira Verhees, Timofey Mukhin, Rita Popova, Lev Kazakevich, Konstantin Filatov)

Abstract

The typological atlas of Daghestan will be a WALS style resource containing information about linguistic features in the languages of Daghestan. Data for this resource are retrieved from grammars and organized into databases which are then used to generate maps. The final product will be a tool for the visualization of information about linguistic structures characteristic of Daghestan, but also a useful resource for bibliographical research on parameters of interest. In this talk we will discuss the state of affairs of the project and our future plans. We will briefly present the preliminary results related to the features that are currently being developed, and we will discuss some technical issues concerning the design of introductory texts and the generation of maps.

18 February

Maria Morozova, Maria Ovsjannikova, Alexander Rusakov

A dialectometric study of Albanian varieties: linguistic complexity and language contact history (Maria Morozova, Maria Ovsjannikova, Alexander Rusakov)

Abstract

The goal of our study is to examine the Albanian dialect continuum using the quantitative methods of dialectometry and interpret the results in terms of the history of the Albanian dialect landscape, in particular its contact history. Our data come from the Dialectological Atlas of the Albanian Language, which maps phonological, morphological and lexical features of 131 Albanian varieties of the main dialectal area. Using distance calculation, MDS analysis and hierarchical clustering, we estimate and visualize the closeness of these varieties and analyse it against their geographical distribution and the traditional classification of Albanian dialects. The main focus of our talk will be on the notion of linguistic complexity as applied to the Albanian dialect continuum. We identify 27 phonological and morphological parameters as binary complexity/simplicity features, examine their realization in the varieties under study and assess the relation between intensity of contact and linguistic complexity. Then we will briefly discuss the distribution of the varieties in terms of 212 lexical features to compare their grammatical and lexical closeness.

11 February

Natalya Stoynova, Irina Khomchenkova

Contact-influenced Russian of Northern Siberia and the Russian Far East (Natalya Stoynova, Irina Khomchenkova)

Abstract

We will present a new small corpus of contact-influenced Russian speech, namely the Corpus of Russian spoken in Northern Siberia and the Russian Far East, and several case studies on contact phenomena in grammar, based on its data. The corpus consists of short oral texts, mostly narratives, collected as a “by-product” of language documentation projects (some of them are Russian versions of texts in the corresponding indigenous language). The current size of the corpus is ca. 34 hours (78452 tokens). The majority of texts come from two regions - the Taimyr peninsula (the subcorpus of Samoyedic Russian) and Khabarovsk Krai (the subcorpus of Tungusic Russian). The most important feature of the corpus is the manual annotation of grammatical and lexical contact-induced features. It will be discussed in detail with a focus on problematic cases. To illustrate the range of problems that can be investigated on the data of this corpus, we will also present several case studies. First, we will try to trace a post-pidgin continuum, attested among the Nganasans, on the data of non-standard gender agreement patterns (cf. бабка помер ‘old woman die.pst.masc’). Second, we will consider a problem of identifying morpho-syntactic calques on the example of non-standard numeral constructions, attested in Tungusic Russian (пять дом ‘five house.nom.sg’, двое сыновья ‘two son.nom.pl’). Third, we will illustrate the problem of differentiation between contact-induced and dialectal features on the example of pluperfect be-constructions (умер было ‘(he) died be.pst’) and non-standard coordination patterns with the particle da (офицер=да, майор=да ‘officers=ptcl majors=ptcl’), attested in the speech of the Nanais. Finally, we will discuss some general quantitative data on contact-induced grammatical features attested in the corpus, namely the general frequency distribution of different types of these features and individual profiles, compiled for several speakers.

4 February

Ezequiel Koile, Konstantin Filatov, Michael Daniel

Bayesian phylogenetic analysis and wordlist handling (Ezequiel Koile, Konstantin Filatov, Michael Daniel)

Abstract

In this talk, we present an introduction to modern Bayesian phylogenetic analysis in historical linguistics. Algorithms will be discussed with special focus on their conceptual motivations, as well as their scope and limitations. Wordlist building and handling will be approached from a practical perspective, including recommendations and examples of implementation. Cross-linguistic online resources and editing tools for this task will be presented. At the end of the seminar, in a separate bonus track, we will introduce you to the ongoing collection of 100 Swadesh lists in Daghestan. Our approach emphasizes the importance of strict provenance of the list (village of collection) and the importance of the protocol for data collection (contextualization of the lexical items). If we have time, we will show the preliminary results of the project: the tree of the Andic branch based on our data.

28 January

Ilya Yakubovich, Russian Academy of Sciences / University of Marburg

Correlates of Language Shift in Population Groups vs. Epigraphic Cultures (Ilya Yakubovich, Russian Academy of Sciences / University of Marburg)

Abstract

Scholars focusing on sociolinguistic situations in ancient societies have no direct access to information about the boundaries of language communities at given points in time but have to study them through the prism of the available written sources. Since the notion of language shift is equally applicable to population groups and epigraphic cultures, and since it can be accompanied by contact-induced changes in both cases, it is appropriate to ask how the difference between the two types of communities correlates with different manifestations of language contact. Both corpora of spoken languages and archaic written texts commonly feature results of lexical transfer (borrowings) as well as structural interference. They seem, however, to exhibit inverse chronological correlation between the two types of contact-induced changes and the moment of language shift. When language contact is observed among population groups, lexical expansion frequently predates the shift to a new native language, while structural changes continue to reflect substrate influence after the act of language shift has taken place. In contrast, in the instance of archaic epigraphic cultures (e.g. hieroglyphic or cuneiform), late or peripheral texts in a particular language frequently reveal deviations in grammatical structure, while written language shift in a scribal community may be accompanied by the retention of graphic loanwords (heterograms), which represent the language of tradition. In my talk I intend to address the reasons for such an asymmetry, and to discuss the peculiarities of some contact situations that do not fall under the proposed generalization.

21 January

Nina Dobrushina, George Moroz

The speakers of minority languages are more multilingual (Nina Dobrushina, George Moroz)

Abstract

Population size is often discussed as a factor which might have influenced patterns of language and cultural evolution (Bowern 2010; Donohue & Nichols 2011; Nettle 2012; Bromham et al. 2015; Greenhill et al. 2018; Koplenig 2019; see Greenhill 2014 for an overview). In this paper, we advance the hypothesis that the larger the population of a language's speakers, the smaller the number of L2s mastered by these speakers. The correlation between the size of a language population and the level of multilingualism of its speakers is tested statistically on a large body of empirical data from Dagestan. Due to the digitalization of the Census of 1926, we have at our disposal reliable information about the population of Dagestanian villages before the urbanization and the rise in population caused by the First Demographic Transition. These data allow us to make credible estimates of the number of speakers of all languages and dialects of Dagestan before the changes of the 20th century. The second type of data comes from a field study of Dagestanian multilingualism. Language repertoires of 4,032 people were collected and coded using the method of retrospective family interviews during field research in 2011-2019. The data on multilingual repertoires cover only a small part of the villages for which we have population sizes, namely 54 villages speaking 29 different lects. We match the population sizes of these 29 languages with the number of L2s spoken by the speakers of these languages from 54 villages, and run a Poisson mixed effects regression model that predicts the average number of second languages spoken by speakers from L1 communities of different sizes. The study confirms the hypothesis that the size of a language population is negatively correlated with the multilingualism of the language community.

14 January

Aigul Zakirova

The particle =OK in the Volga-Kama languages: contexts of use and frequencies of lexicalization (Aigul Zakirova)

Abstract

In this talk I will focus on the emphatic identity particle in the Volga-Kama languages (Chuvash =aχ/ =eχ, Tatar and Bashkir =uk/ =ük, Meadow Mari =ak, Hill Mari =ok and Udmurt =ik). The particle was borrowed from the Turkic (Bulgar) lects to the Finno-Ugric lects of the Sprachbund. Its contexts of use overlap to a fair extent in different languages but do not fully coincide. One of the core functions is that =OK is used when an argument of a proposition is identical to an argument of a different proposition (e.g. that house=OK ‘the same house’, referring to a previously mentioned house). In this talk I will take a closer look at this and other contexts of use of =OK in the languages of the area. Besides, =OK tends to attach and become lexicalized with some items, mostly adverbial expressions. I will present data on the frequencies of such collocations with =OK in the Volga-Kama Sprachbund.

Seminar schedule 2019

24 December

Konstantin Kazenin

Ethnicity, speaking indigenous languages and fertility in the North Caucasus (Konstantin Kazenin)

Abstract

Regions of the North Caucasus have experienced considerable social changes over the last four to five decades, which included intensive urbanization, loosening of traditional family norms, weakening of gender asymmetries and lowering of empowerment of elder generations in communities and families. These processes started at different times and proceeded at different speeds in the republics of the North Caucasus, but in all the republics they were accompanied by a significant decrease in fertility, known as the First Demographic Transition (FDT) in population studies. That decrease was not at all unexpected, as ‘detraditionalization’ changes of the kind recently observed in the North Caucasus and the First Demographic Transition took place nearly simultaneously in many parts of the world. However, a number of cultural characteristics of the North Caucasus allow us to address some issues concerning the First Demographic Transition which are difficult to consider elsewhere. First, at the start of the social and demographic changes women of different ethnicities differed in their fertility levels. As urbanization made interethnic contacts more intensive, the question arises whether these differences were eliminated in the process of the fertility decrease. Second, in the urban population of the North Caucasus, including younger generations, the proportion of speakers of indigenous languages is not negligible. Therefore, it is possible to consider the hypothesis that in families where an indigenous language is spoken in everyday life fertility will not decrease as radically as in families where Russian is the only spoken language of the parents. If this expectation is borne out, we get an interesting case where linguistic and reproductive behavior are related. The talk will start by introducing the concept of the First Demographic Transition. Then we present the key demographic changes observed in the North Caucasus between the 1970s and the 2010s. After that, using data from the Republic of Daghestan (Russian Census 2010 and a sample survey of 2018), we demonstrate that ethnic differences in fertility were preserved to different extents in different educational groups of the urban population. Also, using data from the Republic of Ingushetia (a sample survey of 2019), we show that parents’ speaking an indigenous language really has a significant positive relation to fertility.

17 December

Paul Phelan

The periphrastic causative in West Circassian (Paul Phelan)

Abstract

West Circassian, along with the other languages of the Northwest Caucasian family, is a highly polysynthetic language with complex verbal morphology. One marker in particular, the causative marker ʁe- , is highly productive. The morphological causative is commonly used to derive transitive predicates from nontransitive verbs and nominals. It is also, clearly, an important instrument for expressing the semantics of causation and in the associated valency increasing operation. In some cases ʁe- has even calcified on predicates, so that the predicate has no meaning without the causative prefix. It is therefore rather remarkable that a periphrastic causative strategy has arisen in West Circassian. This construction is based on the matrix verb ṣ̂ə- ‘to do, make’, which pairs with a lexical verb with the purposive suffixation. Based on observations of the behavior of personal indexes on the matrix verb it is apparent that this structure is noncompositional and has grammaticalized with causative semantics which are the same as those of the morphological causative.

10 December

Anastasia Panova, Tatiana Philippova

A detailed corpus study of preposition drop in DagRus: preliminary results (Anastasia Panova, Tatiana Philippova)

Abstract

In this talk we will discuss the phenomenon of preposition drop (omission) that is observed in the speech of L2 speakers of Russian with a Nakh-Daghestanian or Turkic (Kumyk, Azerbaijani) first language. We shall first review insights on the issue from the existing literature on the topic (Daniel & Dobrushina 2009, Daniel, Dobrushina & Knyazev 2010, Daniel & Dobrushina 2013 for Russian spoken in Daghestan; Stoynova and Shluinsky 2010 for Russian spoken by the Enets people; Khomchenkova, Pleshak and Stoynova 2017 for Russian of Northern Siberia and Russian Far East; Shagal 2016 for Russian spoken by the Erzya people), then present the data on preposition drop in the Russian speech of Kumyk and Azerbaijani native speakers, and suggest working hypotheses about the theoretical interpretation of the phenomenon that would guide our further work.

3 December

Polina Nasledskova

Causative alternations database (Polina Nasledskova)

Abstract

There are several kinds of correspondence between causative and non-causative verbs: one can be derived from the other, they can be suppletive etc. These correspondences differ not only from language to language, but also from one causative pair to another within one language. The database was created so that one could deal with exact numbers of these alternations in many languages. I am going to talk about the data, the structure of the database and its challenges.

Ezequiel Koile

The Database of Cross-Linguistic Colexifications CLICS³: Data-driven semantic research from a cross-linguistic perspective

Abstract

The term colexification (François 2008) refers to instances where the same word expresses two or more comparable concepts, covering instances of polysemy, vagueness, and homonymy. The comparative study of colexifications across languages allows for the construction of semantic maps, a useful tool for the study of lexical typology and beyond, with applications ranging from studies of semantic change and patterns of conceptualization to linguistic paleontology. In this talk, we describe the recently released third version of the Database of Cross-Linguistic Colexifications, CLICS³ https://clics.clld.org/ (Rzymski, Tresoldi, et al. 2019), a computer-assisted framework for the interactive representation of cross-linguistic colexification patterns, containing data for 2811 concepts across 2955 languages.

26 November

Nina Dobrushina

Challenges of variation (Nina Dobrushina)

Abstract

Variation is inherent to all languages. It seems, however, that the degree of variation can vary from language to language. It is sometimes claimed that languages with writing systems show more variation than unwritten languages. It was also argued that small languages have less variation than large languages with many L2 speakers. It seems, however, that none of these conjectures were ever empirically tested. In fact, to date we have no methods which would allow measuring and comparing the amount of variation between languages. In this talk I want to raise this problem rather than suggest a solution.

George Moroz, Samira Verhees

Catching variation during fieldwork on Nakh-Daghestanian languages

Abstract

During fieldwork researchers have to deal with all kinds of variation in the answers given by speakers: free variation, idiolectal or sociolinguistic variation. In the present investigation we studied the degree of variation among 44 speakers of Zilo Andi for 16 different morpho(phono)logical features known to be variable in this dialect. Additionally, we conducted a survey among a number of researchers of Nakh-Daghestanian languages, asking them about their fieldwork habits, including questions about how many speakers they usually consult. We used these data to evaluate the probability that an average researcher of Nakh-Daghestanian languages catches the observed variation during fieldwork.

19 November

Olesya Khanina, Andrey Shluinsky, Yury Koryakov

Enets in space and time: a study in linguistic geography and history (Olesya Khanina, Andrey Shluinsky, Yury Koryakov)

Abstract

This paper summarises a joint study by Yuri Koryakov, Andrey Shluinsky, and myself (see Khanina et al. 2018a, Khanina et al. 2018b). Through a series of linguistic maps based on published ethnographic data and our fieldwork accounts, we reconstruct the territories in which Forest Enets and Tundra Enets (Samoyedic, Uralic; Central Siberia) have been spoken from the 17th century till today. We analyze in detail the migrations of the two ethnic groups and the changing language contact scenarios. One of the most intriguing findings of this study is an explanation of the Forest Enets - Tundra Enets puzzle. There is no unanimity on whether they are separate languages or dialects of the same language. Ethnographically, the two linguistic communities are clearly distinct, have different self-nominations, and do not consider themselves as belonging to the same ethnic group. As our field experience has shown, the degree of modern mutual comprehension is not a neutral question and depends on the stance that a speaker takes at the moment of conversation, whether stressing the difference between the two ethnic groups or aiming at reaching his/her communicative goal. Whereas the phonologies of Forest Enets and Tundra Enets suggest a split at least several hundred years ago, and lexicostatistical calculations go even further by dating the split to ca. one thousand years ago, the match between the two Enets grammars is so striking that it contradicts this scenario. So here linguistic geography steps in, documenting migrations that led to a secondary convergence in the 19th century of the once more distinct Enets lects, which was later, in the beginning of the 20th century, followed by a secondary divergence. We support this historical hypothesis with a catalogue of all features that separate the two Enets varieties and with linguistic maps reconstructing changes in the territories of the two ethnic groups in the last 300 years.
References

Khanina, Olesya, Koryakov, Yuri & Andrey Shluinsky. 2018a. Enets in space and time: a case study in linguistic geography. Finnisch-Ugrische Mitteilungen 42, 109-135.

Khanina, Olesya, Shluinsky, Andrey & Yuri Koryakov. 2018b. Forest Enets and Tundra Enets: how similar/different are they and why? Paper presented at the 7th international conference on samoyedology, October 2018 (Tartu, Estonia).

12 November

Alexandra Vydrina

Fouta-Djallon multilingualism (Alexandra Vydrina)

Abstract

Western Africa occupies a central place in the research on multilingualism due to the studies on the sociolinguistic situation in Cameroon (di Carlo 2018) and the Casamance area in Senegal (Lüpke and Storch 2013; Lüpke 2016). The study focuses on yet another case of multilingualism in Western Africa by discussing the multilingualism patterns in the area of the Fouta-Djallon plateau in Guinea. The situation will be analyzed from the perspective of communities speaking Kakabe, a minor language spoken in about fifty villages. The languages involved are Kakabe, Maninka, Pular, and, to a lesser extent, Sussu, with Pular belonging to the Atlantic family and the three other languages to the Mande family. In my talk, I will analyze the attested multilingualism patterns in different types of language practices. The study is based on a multi-media oral corpus representative of a variety of genres and containing data that I have been collecting in the region since 2009.

29 October

Natalya Serdobolskaya

Morphosyntax of complement clauses in East Caucasian languages: long-distance agreement (Natalya Serdobolskaya)

Abstract

The East-Caucasian languages (Nakh-Daghestanian) show a number of puzzling structures that are challenging from the theoretical point of view: non-finite clauses where all the arguments are encoded in the same way as in independent sentences, backward control, long-distance reflexive pronouns and long-distance agreement in complement clauses. This talk is focused on long-distance agreement in East Caucasian languages. First, I discuss the phenomenon in Qunqi Dargwa. The infinitives and converbs are the only complementation strategies that allow long-distance agreement. In Qunqi, there is a fuzzy boundary between infinitives and indirect mood forms. The converbs are used both with control verbs and with emotive and perception complement-taking verbs. The long-distance agreement pattern is only observed with control verbs. I show that these structures show properties of clause union. Then I consider the data of 19 East-Caucasian languages (mostly based on the data from Kibrik 2005 “Materials to the typology of ergativity”), and discuss the long-distance agreement patterns in those languages. In most of these languages this phenomenon is limited to control constructions, while Tsez and Tsakhur deviate from this generalization.

22 October

Albert Davletshin

Subgroups, linkages and beyond: Working on shared innovations in Eastern Polynesian languages (Albert Davletshin)

Abstract

Polynesia covers a vast territory of the planet. It includes a large number of speech communities which descend from a common ancestor; some of them are isolated by thousands of miles of open ocean. No wonder Polynesia has always been a favorite place for both linguists and anthropologists working on phylogenetics. The standard account of the Eastern Polynesian subgrouping is that the language of Easter Island (Rapanui) forms a branch on its own, coordinated with Central Polynesian languages; CE in turn branches into Tahitic and Marquesic (Roger 1985). It has become increasingly evident that Tahitic and Marquesic are not valid subgroups (Vladimir Belikov 2009, Mary Walworth 2014). In my talk, I am going to show that Rapanui, Mangarevan, North and South Marquesan constitute a subgroup within Eastern Polynesian languages. Interestingly enough, this proposal implies some phonological and lexical innovations spreading across the Pacific. The main objective of the talk is to discuss the latter and their implications for the theory of language.

15 October

Alexander Shiryaev, Michael Daniel, George Moroz

Glottalized /lˀ/ in Rikvani Andi (Alexander Shiryaev, Michael Daniel, George Moroz)

Abstract

The opposition of the geminate and singleton ejective lateral stop /L’/, reconstructed to proto-Andic, has been lost in various Andic languages due to the phonetic evolution of the singleton into a different sound type. The speakers of Rikvani Andi, a dialect of Andi (Andic, East Caucasian) spoken by 800 people in the village of Rikvani in Dagestan, developed a glottalized lateral consonant as a reflex. Glottalized sonorants are typologically rare. While they have been sparsely attested in various areas where glottalic initiation also occurs with stops (ejectives), Rikvani Andi is the only variety of a Caucasian language where it has so far been reported. The present study is an acoustic analysis of the sound as documented in the field data collected by the HSE team in the village. Data from six speakers’ word list elicitation as well as from spontaneous texts from two speakers are used. This study is the first that considers glottalized sonorants in connected free speech. We test several generalizations and observations previously made with respect to glottalized sonorants (Um 2001, Maddieson and Larson 2002, Maddieson et al. 2009). We confirm the presence of strong variation in the realization of the glottalized /lˀ/ across speakers in terms of presence of creaky voice, focus of glottalization, intensity and formant structure. We do not confirm Um’s suggestion that the peak of glottal constriction (pre- vs. post-glottalized realization) is affected by the position of the sound in the syllable/word. We conclude that the distinctive feature is the presence of creaky voice and intensity, but these features become less pronounced in free speech. Finding the acoustic cue that serves to distinguish the glottalized sonorant requires further acoustic research. Variation across speakers and fading away of the cues in connected speech may suggest that a merger of /lˀ/ and /l/ is on its way. Various visualization techniques highlighting the nature of /lˀ/ will be discussed.

1 October

Chiara Naccarato, Samira Verhees

Fieldtrip to Botlikh (Daghestan) (Chiara Naccarato, Samira Verhees)

Abstract

In August of 2019 we visited the village of Botlikh to study the Botlikh language. Our aim was to collect some data for a small investigation on agreement patterns of ordinal numerals. In addition, we translated several texts recorded by Togo Gudava in the 1950s-1960s, met a number of potential language consultants and learned some new things about the sociolinguistic situation in Botlikh. We will talk about the trip and our future plans for working on this language.

The Third School on Statistical Methods for Linguistics and Psychology (University of Potsdam, Germany) (George Moroz, Olga Lyashevskaya)

24 September

Ezequiel Koile

Phylogeography of the Bantu Expansion (Ezequiel Koile)

Abstract

The Bantu expansion is among the most important and least understood human migrations. Bantu-speaking populations (240 million people, 500 languages, spanning 9 million km² [1]) are the result of a huge migration originating in a homeland near the border of Nigeria and Cameroon between 4,000BP and 5,000BP [2,3,4,5,6,7,8]. Although the homeland and the time depth are well established, the migration route is still unclear. Recent phylogenetic studies [1,9,6,7,8] support the late-split hypothesis [10,11,12,13,14], which claims that the common ancestor of East-Bantu and West-Bantu languages crossed the African Rainforest, splitting after this. It is thought that this crossing was made through the Sangha River Interval (SRI), a N-S savanna opening into the rainforest. However, in dated phylogenies [7], dates don’t match consistently: the crossing of this corridor should have happened around 4,000BP, while it was completely open only 2,500BP. We propose two different hypotheses to compete with the traditional SRI late-split. The first is a coastal savanna corridor [15]; the second is an earlier path through the rainforest. We compare the hypotheses with a Bayesian phylogeographic approach based on linguistic trees. We use lexical and geographical data for 400+ Bantu and Bantoid languages, inferring the linguistic and geographic history in parallel, by implementing the break-away model [16] in BEAST2 [17]. We conclude that the way through the rainforest happened around 4,000BP.

Chiara Naccarato, Natalya Stoynova, Anastasia Panova

Contact-influenced word order in genitive noun phrases: A corpus-based investigation of Russian spoken in Daghestan

Abstract

The paper deals with non-standard word order in the variety of Russian spoken by bilinguals from Daghestan. Specifically, we focus on the occurrence of prepositive genitive modifiers in bilinguals’ speech. Whereas in monolinguals’ Russian the neutral and most frequent word order in noun phrases with a genitive modifier is the order N+GEN, in Daghestanian Russian the opposite order GEN+N often occurs. This phenomenon was mentioned as one of the striking morphosyntactic features of Daghestanian Russian, and its frequent occurrence can be partly explained in terms of syntactic calquing from speakers’ L1s, all featuring an unmarked GEN+N order in noun phrases. However, the picture is far less trivial than it could look at first sight. On the one hand, the word-order pattern GEN+N does not seem to affect equally all types of genitive noun phrases in Daghestanian Russian. On the other hand, similar examples of non-standard word order are sometimes found in monolinguals’ speech too. In the course of the paper, we present the results of our corpus-based investigation of genitive noun phrases in Daghestanian Russian as compared to monolinguals’ spoken Russian, including dialectal varieties. Prepositive genitives appear to be favored by several lexico-semantic and processing features of both the head and the genitive dependent. The strongest factor is kinship semantics: noun phrases that express a kinship relation tend to be prepositive. In monolinguals’ spoken Russian, although prepositive genitives are very infrequent, they sometimes show similar lexico-semantic and processing features. Therefore, we are not dealing with a simple calquing process. Rather, L1 influence is manifested in the strengthening of some tendencies existing in monolinguals’ Russian too.

17 September

Michael Daniel, Ilya Chechuro, Samira Verhees, Nina Dobrushina

Lingua francas as lexical donors: quantitative field study (Michael Daniel, Ilya Chechuro, Samira Verhees, Nina Dobrushina)

Abstract

The paper investigates the role that the rate of bilingualism plays in lexical borrowing. Our data comes from Daghestan, an area of high language density. Based on loanword counts, we isolate two zones of lexical influence, the south, heavily influenced by Azerbaijani, and the north, dominated by Avar. This salience of Avar and Azerbaijani as donor languages is likely to reflect the historical role of these languages as lingua francas in their respective geographical zones. The study supports the idea of Brown (1996, 2011) that contact influence from a lingua franca is higher than from a language only used to communicate with its L1 speakers. In line with the widespread argument that the amount of contact-induced change from a language is proportional to intensity of bilingualism (Thomason & Kaufman 1988), Brown stipulates that the importance of lingua francas as lexical donors must be linked to the high rate of bilingualism in these languages. The bilingualism in Azerbaijani and Avar was indeed high, as the evidence from field research on traditional language repertoires of Daghestanian highlanders shows. On the other hand, the knowledge of two other locally important languages, Chechen and Georgian, which was, at some locations, only slightly lower, did not lead to the same level of lexical transfer; in fact, the amount of Georgian and Chechen borrowings seems disproportionately low. High bilingualism rates are thus not sufficient for a language to become a major lexical donor. At the level of methodology, the paper explores the prospects of using short wordlists as ‘contact probes’, tools for measuring lexical contact. We follow the approach by Haspelmath & Tadmor (2009) and Bowern et al. (2011) in applying a fixed list of concepts to quantify lexical contact between languages.
Based on field elicitations conducted in a number of villages in the Republic of Daghestan, a list of 160 concepts is shown to be efficient enough to differentiate the degrees of lexical impact of the locally important L2s on minority languages. The method not only ensures comparability across contact situations but also provides a level of resolution that is sensitive to differences between villages speaking the same language. By fine-tuning the wordlist to a different linguistic setting, the methodology suggested here may be extended to other geographical areas of intense language contact and become a tool for reconstructing multilingual patterns of the past.

11 June

Chiara Naccarato, Natalya Stoynova, Anastasia Panova

Contact-influenced word order in genitive noun phrases: A corpus-based investigation of Russian spoken in Daghestan (Chiara Naccarato, Natalya Stoynova, Anastasia Panova)

Abstract

In a recent paper (Naccarato, Panova & Stoynova Forth.), we have examined cases of non-standard word order in the variety of Russian spoken by bilinguals from Daghestan. Specifically, we have restricted our analysis to the noun phrase, and have looked at the occurrence of prepositive genitive modifiers in bilinguals’ speech. As we have shown, whereas in Standard Russian the neutral and most frequent word order in noun phrases with a genitive modifier is the order N+GEN (muž sestry), in Daghestanian Russian the opposite order GEN+N (sestry muž) often occurs. This phenomenon has been partly explained in terms of syntactic calquing from speakers’ L1s, all featuring a neutral GEN+N order in noun phrases. However, such inversion in word order does not seem to equally affect all types of genitive noun phrases in Daghestanian Russian, but appears to correlate significantly with noun phrases featuring kinship semantics. Moreover, similar examples of non-standard word order are sometimes found in monolinguals’ speech too, which makes the picture far less trivial than it could look at first sight. In this talk, we present the latest results of our corpus-based investigation of genitive noun phrases in Daghestanian Russian as compared to monolinguals’ spoken varieties of Russian, with the aim of explaining the factors boosting non-standard word-order realizations.

4 June

Samira Verhees, Chiara Naccarato

Animacy agreement in Botlikh: ordinal numeral (Samira Verhees, Chiara Naccarato)

Abstract

Botlikh (Avar-Andic, East Caucasian) features a two-fold animacy agreement system including, on the one hand, a set of noun class (i.e. gender) markers representative of many EC languages and, on the other hand, an additional set of dedicated animacy markers which are unique to Botlikh. The dedicated animacy markers can appear on various targets (i.e. negative copulas, interrogative particles, question word formants, attributive clitics, present/future participles, ordinal numerals), and agreement is controlled by either the nominal head or the absolutive argument of the verb. By focusing on ordinal numerals, which appear to mark animacy most consistently, we set the following goals: a) to better understand the agreement patterns of these forms; b) to clarify which referents qualify as animates and which do not. For these purposes, we have created the first draft of a survey which we will discuss during the talk.

26 February

Anastasia Panova

The scope of refactive markers in Abaza (Anastasia Panova)

Abstract

Most descriptions of Abaza mention two affixes which express the meaning of refactive (‘again’, ‘once more’, etc.): the suffix -χ and the prefix ata-, which almost always appears only in combination with -χ. I argue that the main difference between these two refactive markers is that the marker -χ “sees” the internal structure of an event and can have scope over any part of it (just the resultant state, or just the process, with or without arguments), while the marker ata-+-χ is “blind” to the internal structure of the situation and can only “copy” the whole event with its arguments.

12 February

Samira Verhees, Ilya Chechuro

Noun vs. verb inflectional synthesis: A complexity trade-off? (Elena Sokur, Johanna Nichols)

A database for loanwords in Daghestan (Samira Verhees, Ilya Chechuro)

Abstract

In this talk we introduce our first pilot database for the DagLoans project. The database contains translations of 160 concepts collected in the field in Daghestan (and Northern Azerbaijan). At present, this includes a total of 24,785 entries from 23 different languages. The database can be used to find the translation of a concept in one or more languages. The most important feature is “Set”: all entries are grouped in sets with other similar words, which allows us to plot the spread of lexical items on the map. The database can be used for conducting quantitative research on lexical convergence as well as for creating geographical maps showing the areas and the intensity of foreign influence.

5 February

George Moroz

Bayes Factor: Bayesian way without diving in Bayesian maze (George Moroz)

Abstract

The most common statistical task is hypothesis testing. When a pair of competing models is fully defined, their definition immediately leads to a measure of how strongly the data support each model. The ratio of their support is often called the likelihood ratio or the Bayes factor. During the talk I will show how to define different models and compare them with the Bayes factor.

Typological atlas of multilingualism in Daghestan: problems and perspectives (Konstantin Filatov)
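
As an illustration of the Bayes factor comparison described in George Moroz's abstract above, here is a minimal sketch in Python. It is not taken from the talk: the binomial setting, the function names and the example numbers are assumptions chosen for illustration. For two fully specified models of a success probability, the Bayes factor is simply the ratio of the likelihoods of the observed data; for a model with a uniform prior on the probability, the marginal likelihood has the closed form 1/(n+1).

```python
from math import comb


def binomial_likelihood(k, n, p):
    """Probability of observing k successes in n trials if the success rate is p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def bayes_factor(k, n, p_model_1, p_model_2):
    """Bayes factor of model 1 over model 2 for two fully specified (point) models.

    With point hypotheses the Bayes factor coincides with the likelihood ratio.
    """
    return binomial_likelihood(k, n, p_model_1) / binomial_likelihood(k, n, p_model_2)


def marginal_likelihood_uniform_prior(k, n):
    """Marginal likelihood of k successes in n trials under a uniform prior on p.

    Integrating the binomial likelihood over p in [0, 1] gives 1 / (n + 1).
    The argument k is kept only for a uniform interface; the result does not depend on it.
    """
    return 1.0 / (n + 1)


# Hypothetical data: 7 successes in 10 trials.
# Model 1 assumes p = 0.7, model 2 assumes p = 0.5.
example_bf = bayes_factor(7, 10, 0.7, 0.5)
```

With these made-up numbers the Bayes factor is roughly 2.28 in favour of the first model, which would usually be read as weak evidence. Comparing a point model against the uniform-prior model works the same way: divide the point model's likelihood by the marginal likelihood.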

29 January

Chiara Naccarato

The u+gen construction in Modern Standard Russian (Chiara Naccarato)

Abstract

In Modern Standard Russian, the prefix/preposition pair u-/u is peculiar with respect to other similar pairs, due to the meaning mismatch between the two. While the prefix u- has an ablative meaning, as shown when it is prefixed to motion verbs, the prepositional phrase u+gen occurs in locative constructions, and other related constructions, such as predicative possession that is expressed via the cross-linguistically common Locative Schema. Etymological considerations show that the meaning preserved by the prefix is older. The only type of occurrence which, according to the literature, preserves the ablative meaning for the u+gen construction is found with verbs of requesting, removing, and buying. Notably, however, in other Slavic languages putative ablative contexts are limited to verbs of requesting. Data from MSR, OCS, Polish and Czech lead to the conclusion that the extension of the u+gen construction to verbs of removing in MSR is based on its use for the encoding of predicative possession. Extension to verbs of buying is better explained through the locative meaning of the construction. As a result of different developments, the u+gen construction has become part of the argument structure of a group of verbs including verbs of asking and requesting, verbs of removing and verbs of buying, which are characterized by the common feature of taking human non-recipient third arguments.

22 January

Ivan Kapitonov

Kunbarlang (Ivan Kapitonov)

Abstract

Kunbarlang is a critically endangered polysynthetic language spoken in central Arnhem Land, Northern Territory, by approximately 40 people. It belongs to the non-Pama-Nyungan Gunwinyguan family. This talk reports on the first comprehensive description of Kunbarlang (although it builds on and extends important unpublished work by Carolyn Coleman and Joy Kinslow Harris). Kunbarlang has very rich verbal morphology that includes complex agreement paradigms, a composite TMA system that differs from other Gunwinyguan languages, an array of argument derivation tools, and coverb constructions. The nominal domain, on the contrary, has little morphology and relies heavily on syntactic constructions - for instance, case marking of nouns is analytical. The talk will give a general overview of the grammar, and then focus on a few selected topics across different areas.

15 January

Ekaterina Schnittke

Sequence of tenses in Russian? Tense choice in complement clauses in Standard and Learner Russian (Ekaterina Schnittke)

Abstract

It is generally believed that Russian has no sequence of tenses (SoT) in complement clauses, and the choice of absolute tense over relative is considered to be a typical error in the interlanguage of non-standard speakers of Russian as a foreign language whose native language features SoT, e.g. English. However, all uses of absolute tense in Learner Russian cannot qualify as errors, since Standard Russian shows a great deal of variation in tense assignment in complement clauses. One of the factors that is said to govern tense choice is the semantics of the matrix verb (Barentsen, 1996; Гиро-Вебер, 1975, Schlenker, 2003, inter alia). Specifically, speech and mental verbs are said to strictly require the relative tense, whereas sensory, emotion, and existential matrix verbs allow for both absolute and relative tense patterns. Despite the acknowledged variation, the precise distributional patterns of tenses in complement clauses have been understudied. This paper is a systematic corpus-based study of the variation in tense choice across the semantic classes of the matrix verbs in two language varieties: (i) Standard Russian as represented in the Russian National Corpus and (ii) Learner Russian of anglophone speakers as represented in the Russian Learner Corpus. I examine those clausal complexes where the matrix verb in the past tense and the verb of the complement clause denote simultaneous actions. The analysis identified a likelihood hierarchy of verbal semantic classes ranging from the least likely to tolerate past tense in the complement clauses to the most likely ones: speech<mental<sensory≈emotion

Seminar schedule 2018

4 December

Timur Maisak, Michael Daniel, Yury Lander

Corpus research of target relativization in several languages of the Caucasus (Timur Maisak, Michael Daniel, Yury Lander)

Abstract

In this talk, we will discuss modifying participial constructions, which are the predominant type of relative clauses in East Caucasian languages. One of the key properties of participles in East Caucasian languages is the lack of syntactic orientation. There are few to no syntactic restrictions on what can be relativized: the gap in the relative clause can correspond to a core argument, a peripheral participant or even a participant that is not part of the verb’s argument structure. Languages also share some common patterns of constructionalization of specific relative constructions (such as name-constructions). On the other hand, there is variation across languages, e.g. more or less strongly articulated preference for S relativization; or more or less widespread use of the resumptives; and language-particular features, e.g. a very high ratio of addressee relativization (in name-constructions) in Agul. After a general overview of the problems related to the study of relativization targets, we concentrate on language-particular case studies and discuss the counts of relativization targets in the corpora of two East Caucasian languages (Agul and Archi). As a comparative background, West Circassian corpus data will be presented. In this language relativization is syntactically oriented, the strategy cannot be classified as participial, and special reflexivizers may be interpreted as obligatorified resumptive pronouns. Finally, we discuss the comparison of corpus counts on the relativized syntactic role in the three languages, and the problems connected to such comparison.

13 November

Nina Dobrushina

Conditions and questions: several cases of combined marking in Nakh-Dagestanian languages (Nina Dobrushina)

Abstract

In this paper, I consider several suffixes of Lezgic languages (possibly, but not definitely, related) that cover a rather wide range of contexts. Some contexts of their use may be qualified as denoting unrealized states of affairs (such as conditional clauses, polar questions, and, to a certain extent, indefinite pronouns). Some others fall short of this definition, including indirect questions and other subordinate clauses with WH-words. The set of contexts covered by the markers in question is one and the same in at least three Lezgic languages (Lezgian, Aghul, Tabassaran), and also in Azerbaijani, which raises the question of a possible contact origin of this pattern. Some other Lezgic languages employ, in these contexts, several different markers. The Kina dialect of Rutul presents an especially interesting case, combining in one morpheme (-jden) meanings which are unlikely to be associated. In this talk, I will present the case of Kina Rutul in detail, discuss possible interpretations and origins of the marker -jden, and compare Kina Rutul with other Lezgic languages.

6 November

Michael Daniel

Evaluating DP as a measure of corpus heterogeneity. The Even dialect comparison project at crossroads (Michael Daniel)

Abstract

In this SMALL discussion, we will present the path taken so far for the methods of inter-dialectal comparison, the point we are currently standing (or stuck) at, and will gratefully take advice as to how to proceed. We will first recall the starting point of the project, i.e. a method of isolating inter-dialectal divergence that takes into account inter-speaker variation. Then we will briefly review the steps we have taken so far (LogLikelihood, Wilcoxon-Mann-Whitney test, Gries’ DP). We will then focus on the very last result we got, evaluating the observed DP value against the simulation and permutation test for the distribution of the DP in a random sample - and whether we can use it for our purposes.
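
To make the last step concrete, here is a minimal sketch in Python of Gries’ DP (“deviation of proportions”) and of a simple simulation against which an observed value can be compared. It is not the project's code: the function names, the toy numbers and the particular simulation design (tokens of an item assigned to corpus parts at random, with probability proportional to part size) are assumptions made for illustration. DP is computed as half the sum of absolute differences between each part's share of the corpus and its share of the item's occurrences; 0 means a perfectly even distribution, values close to 1 a highly uneven one.

```python
import random


def gries_dp(part_sizes, item_counts):
    """Gries' DP: 0.5 * sum(|observed share - expected share|) over corpus parts.

    part_sizes  -- size (e.g. in tokens) of each corpus part, such as each speaker
    item_counts -- number of occurrences of the item in each part
    """
    total_size = sum(part_sizes)
    total_count = sum(item_counts)
    if total_size == 0 or total_count == 0:
        raise ValueError("corpus and item frequency must be non-zero")
    return 0.5 * sum(
        abs(count / total_count - size / total_size)
        for size, count in zip(part_sizes, item_counts)
    )


def simulate_dp(part_sizes, total_count, n_sim=1000, seed=0):
    """DP values obtained when total_count tokens fall into parts at random.

    Each token lands in a part with probability proportional to the part's size,
    i.e. the null scenario in which the item is evenly dispersed.
    """
    rng = random.Random(seed)
    part_ids = range(len(part_sizes))
    simulated = []
    for _ in range(n_sim):
        counts = [0] * len(part_sizes)
        for part in rng.choices(part_ids, weights=part_sizes, k=total_count):
            counts[part] += 1
        simulated.append(gries_dp(part_sizes, counts))
    return simulated


def simulated_p_value(observed_dp, simulated_dps):
    """Share of simulated DP values that are at least as large as the observed one."""
    return sum(dp >= observed_dp for dp in simulated_dps) / len(simulated_dps)
```

For example, an item that occurs 10 times in the first of two parts of sizes 50 and 150 and never in the second has DP = 0.75; comparing that value with `simulate_dp([50, 150], 10)` shows how often such unevenness arises by chance alone.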

30 October

Aigul Zakirova

The emphatic identity particle =OK in the Volga-Kama Sprachbund (Aigul Zakirova)

Abstract

The particle =OK, originally Turkic, is attested in all the core members of the Volga-Kama Sprachbund: Chuvash =aχ/ =eχ, Tatar and Bashkir =uk/ =ük, Meadow Mari =ak, Hill Mari =ok and Udmurt =ik. Meadow Mari, Hill Mari and Udmurt have arguably borrowed the particle from Turkic (Bulgar). =OK is used in contexts many of which may be characterized as emphatic identity contexts: the argument marked by =OK is the same as an argument of a different proposition (≈ Russian že: Masha rabotajet v pole, Masha že sidit s det’mi ‘Mary works in the field and it is also she who sits with the children’). However, in different languages =OK exhibits different morphosyntactic restrictions on the constituent to which it may attach, e.g. in Tatar it attaches to demonstratives (šul uk keše ‘the same person’) but not to proper names (*Märijäm ük ‘it is also Mary who…’). In Chuvash, Meadow and Hill Mari and Udmurt the particle can attach to proper names. =OK can also attach to the verb, with different interpretations and again with different morphosyntactic restrictions. There are similar constructions and lexicalizations with =OK in some of the languages (e.g. reduplication construction of the type V-converb=OK V). I would like to discuss whether – and how – we can approach these similar patterns in terms of contact. From the literature we know what the strongest bonds in the area were (Chuvash-Mari, Tatar-Meadow Mari, Tatar-Bashkir). The question is whether greater and weaker similarity of =OK morphosyntactic, construction and lexicalization patterns between languages corresponds to areal affinity and how to demonstrate it.

9 October

Alexey Koshevoy

Validity of the data collected indirectly: belated proof of concept (Alexey Koshevoy)

Abstract

Within the framework of the Multidagestan project, a vast amount of sociolinguistic data about traditional small-scale multilingualism was collected in Daghestan. The aim of the project is to trace the change of the multilingual patterns in the 20th century. However, 71 percent of the data were collected in an indirect way, by asking people about their relatives. We will discuss the statistical methods that we used to check the robustness of the indirectly collected sociolinguistic data.

Seminar schedule 2017

21 November

Timur Maisak, Anastasia Panova

Corpus of Russian spoken in Daghestan (Timur Maisak, Anastasia Panova)

Abstract

The corpus of regional variants of Russian spoken in Daghestan is based on transcribed sociolinguistic interviews in Russian with speakers of various Daghestanian languages who live in rural areas. Technically, the corpus is built using the platform and annotation principles developed for the dialectal corpus of Ustja River Basin. The aims of the project include both its maintenance and adding new texts, as well as the use of the corpus for systematic study of morphosyntactic characteristics of Daghestanian Russian. In the talk, we plan to discuss the current state of the corpus, the possibilities of corpus-based research, as well as the problems we met and the perspectives of the project.

14 November

Olga Lyashevskaya, Ilya Chechuro

Prosodic Analysis of Non-standard Russian Spontaneous Speech (Olga Lyashevskaya, Ilya Chechuro)

Abstract

The study deals with the intonation patterns in three non-standard varieties of Russian. In the first part of the talk we discuss the methodological issues, such as analysis of pitch range, data filtering and data representation. In the second part, we consider a number of case studies, namely Daghestanian Russian, Southern Russian and Jewish Russian. The general goal of the project is to compile an annotated corpus of non-standard Russian with pitch annotation and explore the intonation patterns of non-standard varieties of Russian based on corpus data.

7 November

Aleksei Fedorenko, Yury Lander, George Moroz

Circassian Isoglosses (Aleksei Fedorenko, Yury Lander, George Moroz)

Abstract

West Circassian and Kabardian languages represent a dialectal continuum spread in Krasnodar district, republics of Adygea, Karachai-Circassia and Kabardino-Balkaria. During the presentation, we are going to talk about phonetic and sociolinguistic features of different Circassian idioms, present a project of an atlas of Circassian isoglosses and show first maps, which the atlas will include. In addition, we will describe the process of creating a phonetic and grammatical questionnaire and the difficulties associated with that.

26 October

Vasilisa Andriyanets, Brigitte Pakendorf

Dialectal Variation in Even based on Corpora of Field Recordings (Vasilisa Andriyanets, Brigitte Pakendorf)

Abstract

The talk presents the “Dialectal variation in Even” project. The project works with two dialects of Even: the westernmost one spoken in the village Sebjan-Küöl in Yakutia and the easternmost one spoken in Kamchatka. The first dialect has been in contact with Yakut for a long time, while the second has possibly been in contact with Koryak and Itelmen. The aim of the project is to discover differences between the two dialects and whether they stem from independent innovations or contact. For this, we use corpora of field recordings collected by Brigitte Pakendorf in the course of 2007-2012. In this talk we will describe the data, discuss what differences can be found in them and what statistical methods can be used for this, and present some differences in morphology and syntax that have been found so far.

19 October

Alexandra Kozhukhar, Olga Lyashevskaya

Universal Dependencies for Mehweb Dargwa (Alexandra Kozhukhar, Olga Lyashevskaya)

Abstract

The Universal Dependencies (UD) is a project dealing with consistent cross-linguistic morphological and syntactic mark-up. The UD is currently in version 2 and covers 52 languages with 10 more languages yet to be included. With its own annotation principles and abstract inventory for parts of speech, morphosyntactic features and dependency relations, UD aims to facilitate multilingual parser development, crosslingual learning, and parsing research from a language typology perspective. While UD covers 11 language families, it does not include any languages of the Caucasus (including the East Caucasian family). In our talk we will describe the way Mehweb Dargwa (East Caucasian) meets the UD scheme.

10 October

Anna Volkova, Michael Voronov

Spoken Meadow Mari corpus: data, design, and aims (Anna Volkova, Michael Voronov)

Abstract

The talk presents the Spoken Meadow Mari corpus project. Meadow Mari is a Uralic language spoken in the Volga region by some 375 thousand people. The core of the corpus consists of recordings made in 2000-2001 by a group of researchers from the Lomonosov Moscow State University. In our talk we will discuss the data we have, possible applications of the project and the target audiences of the corpus, as well as its structure. Making the corpus data presentable involves transcribing, glossing and annotating the data as well as aligning audio and text, which should facilitate data analysis.

3 October

Sven Grawunder, Michael Daniel, Vasilisa Zhigulskaya, George Moroz

Daghestanian Stops (Sven Grawunder, Michael Daniel, Vasilisa Zhigulskaya, George Moroz)

Abstract

Rich consonantal inventories are a salient feature of the languages of the Caucasus on the whole and of the languages of Daghestan in particular. Their composition is subject to some variation from language to language, but is overall similar. All languages have ejectives, most languages have labialization, and geminates are not uncommon. On the other hand, acoustic properties of the phonologically identical elements may be substantially different. To document these differences, we have been making systematic field recordings of data from different languages over the last few years. In this talk we will introduce the aims and methods of the project and will present preliminary results of our analysis of acoustic properties of ejectives as compared to corresponding voiceless stops. We will evaluate the impact of such parameters as closure duration and voice onset time. We will use the data from three different languages - Rutul (Kina dialect), Andi (Zilo dialect) and Mehweb. In the long run, we hope that our project will be able to address the following theoretical question: is the observed intragenetic variation more sensitive to areal or to systemic factors?

26 September

Nina Dobrushina, Alexandra Kozhukhar

Daghestanian Multilingualism (Nina Dobrushina, Alexandra Kozhukhar)

Abstract

The project “Atlas of Multilingualism of Daghestan” is based on sociolinguistic interviews recorded in Daghestan by the project team over the course of seven years. The aim of the project is to determine the level of bilingualism in Daghestanian mountain villages and to describe the sociolinguistic patterns of linguistic convergence of local languages. In addition, the project makes it possible to establish the type of linguistic contact characteristic of neighboring villages, which languages not pertaining to a particular area were spoken by inhabitants, how the command of Russian changed, what role the geographic distance between languages played, and how the command of certain languages was distributed among the inhabitants of a village. This talk will focus on two topics. First, we will show the results of a study of how the command of languages was distributed among men and women and how these dynamics have evolved from the beginning of the 20th century to the present. Second, we will discuss some problems and shortcomings of the method used and suggest some verification methods. Presentation

19 September

Michael Daniel, Samira Verhees, Ilya Chechuro

Daghestanian Loans

Abstract

The DagLoans project aims at investigating lexical convergence between East Caucasian languages and their neighbours in quantitative terms. We focus on horizontal interaction, looking at borrowings between languages that are in direct contact and setting aside the influence of dominant cultures and distant languages, e.g. Arabic, Persian or Russian. The project consists of two parts: one dealing with lexical matter copy, the other with lexical pattern copy. Today we are only discussing the data of the former. We deal with lexical matter borrowing in an attempt to compare and quantify horizontal borrowing between languages at different locations. Instead of comparing standard languages, we aim at comparing local varieties and, ideally, village lects. Based on the Leipzig-Jakarta list and issues of “Отраслевая лексика”, we attempt to compose a list of lexical items with a high borrowability rate. The list should be concise enough to be elicited from several speakers during a one-day visit to a village, yet long enough to discriminate local varieties and, ideally, village lects. At present, we work on data from the Rutul, Tsunta and Botlikh districts. In the talk on 19 September we plan to discuss the list, its composition, and the elicitation techniques we use, in order to decide whether it is an adequate tool for studying the rate of lexical contact at the local level and how it reflects local geography and data on bilingualism. The talk is based on data from 6 villages (4 languages) of the Rutul region that are located in the same valley: Khlut (Lezgian), Kiche (Rutul), Rutul (Rutul), Kina (Rutul), Helmets (Tsakhur), and Kusur (Avar). Presentation