Publications
The chapter surveys applicative constructions in the languages of the Northwest Caucasian family.
P(reposition)-stranding is typologically rare. Nevertheless, many languages exhibit phenomena that look like P-stranding (Campos 1991, Poplack, Zentz, and Dion 2012) or involve P-stranding under common theorizing (see Philippova 2014 and references therein). These studies argue that such phenomena are not instances of P-complement movement and provide alternative analyses. This squib addresses Russian prepositions that can be postposed to and apparently stranded by their dependents. They are proposed to be PPs rather than P-heads, with dative dependents adjoined similarly to external possessors. The analysis captures all idiosyncrasies of their nominal dependents and obviates the need to posit exceptional P-stranding in Russian.
In this paper, we address the issue of reliability of quantitative data on multilingualism of the past obtained as recall data. More specifically, we investigate whether the interviewees’ assessments of the language repertoires of their late relatives (indirect data) provide results that are quantitatively similar to those obtained from the people of the same age range themselves (direct data). The empirical data we use come from an ongoing field study of traditional multilingualism in Daghestan (Russia). We trained machine learning models to see whether they can detect differences between indirect and direct data. We conclude that our indirect quantitative data on L2 other than Russian are essentially similar to the direct data, while there may be a small but systematic underestimation when reporting others’ knowledge of Russian.
This paper focuses on the noun phrase in Tanti Dargwa (East Caucasian) and presents evidence for the distinction between modifiers proper (adjectival phrases, participial relative clauses and non-genitive adnominal NPs) and determiner-like elements (demonstratives, indefinite pronouns, numerals and most quantity expressions) in this language. Crucially, this dichotomy, which presumably reflects the distinction between the determinative and descriptive components in the NP, is realized in Tanti Dargwa mostly morphologically – in the distribution of “attributive markers” and in the expression of number. Syntactically, in the most neutral constructions the order of elements other than the head is virtually free and does not display any scope-related effects, while the head occupies the final position. In addition, Tanti Dargwa shows marginal constructions (a right-periphery construction locating a modifier after the head and a construction showing quasi-incorporation of a modifier into the noun) which are restricted to modifiers. Tanti Dargwa data support the idea that the description/determination distinction is gradual rather than discrete, as there are elements that show behavior intermediate between modifiers proper and determiner-like elements: possessor NPs, contrastive modifiers and expressions like ‘other’.
This paper describes the Shughni Documentation Project consisting of the Online Shughni Dictionary, morphological analyzer, orthography converter, and Shughni corpus. The online dictionary has not only basic functions such as finding words but also facilitates more complex tasks. Representing a lexeme as a network of database sections makes it possible to search in particular domains (e.g., in meanings only), and the system of labels facilitates conditional search queries. Apart from this, users can make search queries and view entries in different orthographies of the Shughni language and send feedback in case they spot mistakes. Editors can add, modify, or delete entries without programming skills via an intuitive interface. In future, such website architecture can be applied to creating a lexical database of Iranian languages. The morphological analyzer performs automatic analysis of Shughni texts, which is useful for linguistic research and documentation. Once the analysis is complete, homonymy resolution must be conducted so that the annotated texts are ready to be uploaded to the corpus. The analyzer makes use of the orthographic converter, which helps to tackle the problem of spelling variability in Shughni, a language with no standard literary tradition.
In this paper, I consider double causatives in Mehweb, a one-village language spoken in Daghestan, Russia, and belonging to the Dargwa branch of East Caucasian. The capability of stacking two causative suffixes seems to be lexically restricted and to map onto verbal meanings that are typically P-labile in the languages of the family. Interestingly, the verbs allowing double causatives are not morphosyntactically labile in Mehweb, which is generally poor in labile verbs as compared to sister languages. I conclude that the ability to form double causatives is not a consequence of the morphosyntactic property of being labile; rather, both morphosyntactic properties follow from the same component of the lexical semantics of these verbs and ultimately from the properties of the situational concepts they convey. As a tentative functional explanation, I suggest that the relevant property is the weakened status of the agentive participant.
The widespread Uralic family offers several advantages for tracing prehistory: a firm absolute chronological anchor point in an ancient contact episode with well-dated Indo-Iranian; other points of intersection or diagnostic non-intersection with early Indo-European (the Late Proto-Indo-European-speaking Yamnaya culture of the western steppe, the Afanasievo culture of the upper Yenisei, and the Fatyanovo culture of the middle Volga); lexical and morphological reconstruction sufficient to establish critical absences of sharings and contacts. We add information on climate, linguistic geography, typology, and cognate frequency distributions to reconstruct the Uralic origin and spread. We argue that the Uralic homeland was east of the Urals and initially out of contact with Indo-European. The spread was rapid and without widespread shared substratal effects. We reconstruct its cause as the interconnected reactions of early Uralic and Indo-European populations to a catastrophic climate change episode and interregionalization opportunities which advantaged riverine hunter-fishers over herders.
We study the correlation between phylogenetic and geographic distances for the languages of the Andic branch of the East Caucasian (Nakh-Daghestanian) language family. For several alternative phylogenies, we find that geographic distances correlate with linguistic divergence. Notably, qualitative classifications show a better fit with geography than cognacy-based phylogenies. We interpret this result as follows: the better fit may be due to implicit geographic bias in qualitative classifications. We conclude that approaches to classification other than those based on cognacy run the risk of implicitly including geography and geography-related factors as one basis of genealogical classifications.
We outlined in chapter 1 the goals of this survey of number across a diverse sample of languages: investigate the properties of number systems in some depth, while at the same time guaranteeing direct comparability between the analyses of systems that can be very different from each other. At the conclusion of this survey, it is appropriate to consider what picture emerges from it. We will structure our answer in three steps. First, in section 2, we briefly review the main results arising from the chapters in this volume, and we organize the typology of number values that emerge from them. Most of these considerations broadly confirm what is generally known (or assumed) about the expression, content, and variation space of number systems across natural languages. The next sections, from 3 to 7, discuss in greater depth and in a more analytical perspective some important themes, especially focusing on issues that can shed new light or contribute to current understanding of number systems. Finally, section 8 offers a wrap-up discussion and points to desiderata that arise from these studies.
The volume is devoted to the typology of the category of number in the world's languages.
The chapter provides a detailed description of the expression of number in West Circassian.
The paper describes expressions with the meaning ‘other’ in East Caucasian (Nakh-Daghestanian) languages. It is shown that four main strategies can be distinguished: i) the ‘one’-based strategy: ‘other’ includes the numeral ‘one’; ii) the demonstrative-based strategy: ‘other’ includes a demonstrative pronoun; iii) the mixed demonstrative-based + ‘one’-based strategy: ‘other’ includes both a demonstrative and the numeral ‘one’; and iv) the lexical strategy: ‘other’ is a dedicated adjective (pronoun), not necessarily derived from any other clearly discernable source.
In this paper, we provide a survey of the diachronic development of the Russian second genitive (Gen2). As endpoints of this development, we consider data from Russian dialects representing different dialect groups. Presumably, the expansion of Gen2 started off as ‘recycling’ of the genitive of a declension type that became obsolete already in the prewritten period. Nouns of this declension type were adopted by another declension, carrying their old genitive over as a variant form. This alternative ending started spreading, always as a variant, to other nouns in the adoptive declension. As the survey of the literature shows, in the course of this expansion new constraints evolved, including phonological, morphophonological, phonotactic, syntactic and semantic conditioning. While there is no declension class or even individual noun where Gen2 became the only option, it expanded to different extents in different dialects. We believe that the diversity of functions associated with the form, the range of language-internal factors driving its expansion, as well as the current geographic distribution of constraints on its formation weaken the claim that the emergence of Gen2 as a morphological category dedicated to the partitive was due to contact with the languages of the Circum-Baltic area, a suggestion made on a macro-areal basis and also based on comparison with the northern dialects alone. While we cannot argue that the data we present disprove the contact factor, we would at least expect that the increased granularity of dialectal data would provide some data to support it. This is not what happens, which we consider to be an argument against contact-induced change. The aim of the paper is twofold: to present a synopsis of the discussions of the history of Gen2 and a survey of the data on the use of Gen2 in the dialects, both firsthand and available from the literature; and to question the role of contact in the emergence of the new category of Gen2 in Russian.
This paper develops the claim that head properties arise due to (at least) one of three factors: (i) the higher position of an element in a compositional structure, (ii) informational prominence, and (iii) the development of a construction from an appositive(-like) structure. These factors are logically independent and may lead to the assignment of head properties to different elements of a construction. As a result, it is more accurate to speak not of heads but rather of head effects, which may – but need not – concentrate around a single component of a construction.
Efficiency is central to understanding the communicative and cognitive underpinnings of language. However, efficiency management is a complex mechanism in which different efficiency effects (such as articulatory, processing and planning ease, mental accessibility, informativity, and online and offline efficiency effects) conspire to yield the coding of linguistic signs. While we do not yet exactly understand the interactional mechanism of these different effects, we argue that universal attractors are an important component of any dynamic theory of efficiency that aims to predict efficiency effects across languages. Attractors are defined as universal states around which language evolution revolves. Methodologically, in contrast to many previous, language-specific studies on efficiency, we approach efficiency from a cross-linguistic perspective on the basis of a world-wide sample of 383 languages from 53 families, balancing all six macro-areas (Eurasia, North and South America, Australia, Africa, and Oceania). We focus on the grammatical domain of verbal person–number subject indexes. We claim that there is an attractor state in this domain toward which languages tend to develop and which they tend not to leave if they happen to comply with the attractor in their earlier stages of evolution. The attractor is characterized by different lengths for each person and number combination, structured along Zipf’s predictions. Moreover, the attractor strongly prefers non-compositional, cumulative coding of person and number. Efficiency pressures are most powerful in two domains, namely the drive toward less processing effort and less articulatory effort, while increased lexicon complexity and memory costs are weaker efficiency pressures for this grammatical category given its order of frequency. Constant information flow overrides articulatory efficiency.
West Circassian has no fewer than two imperatives and two optatives. Their distribution depends on various parameters such as the speaker’s control over the situation, the person of the topic and the type of predicate. The whole system can arguably be described with respect to the universal Imperative Prototype, which reflects grammaticalization of a discourse function.
This study examines the passive construction with intransitive motion verbs in the Kazym dialect of Khanty, based on a survey of speakers from Kazym village, Khanty-Mansi Autonomous Okrug, Russia (2019–2020). The aim of the study is to describe the mechanism of motion passivization and to explain the underlying properties of the motion event and its participants. The study reveals that the motion passive combines only with goal-oriented verbs. It raises the animate possessor of the goal affected by the event while demoting the trajector. Typologically, it is similar to adversative constructions; however, it is less grammaticalized and more lexically restricted.
This paper describes some aspects of the West Circassian Corpus, an electronic resource containing annotated texts in West Circassian (Adyghe). We focus on some ways in which the corpus can serve as an important instrument for teaching West Circassian. In particular, it is shown that the West Circassian Corpus can be used for compiling exercises, for checking the actual use of the language, and for organizing small-scale research projects devoted to the functioning of West Circassian. The paper contains some examples of the use of the West Circassian Corpus for teaching purposes.
In this paper, I discuss the system of future tense forms in Ulcha (Tungusic) from a language shift perspective. In Ulcha, the future semantic domain is covered by three main forms, i.e. two dedicated future tense forms and a present tense form. The first objective of the paper is to reconstruct the initial system of forms with reference to future (i.e. the system attested before the language shift). Based on data from early texts, I describe their distribution and competition within the future semantic domain. Three main forms compete in this domain. These are present forms, which are actively used in Ulcha both with reference to present and with reference to future, and two dedicated future tense forms. Both dedicated forms have additional modal semantic nuances. What is especially important for this study is that these forms are chosen on different grounds; therefore, the range of contexts in which both forms are acceptable is quite wide. The second objective is to analyze how the use of future tense forms changes in a situation of language shift. This part of the study is based on comparative textual data (those coming from early archive texts and those coming from our recent field recordings), as well as on elicited data obtained from contemporary speakers. As expected, the frequency of non-specialized (present) forms increases, while the frequency of dedicated future tense forms becomes lower. At the same time, contrary to expectations, no notable semantic changes take place. Dedicated future tense forms are used by contemporary speakers more rarely, but in exactly the same contexts as before the language shift. I also consider a related morphological topic, i.e. regularization of the person-number paradigm in one of the future forms. The morphological data show a similar picture: the regularization process is indeed attested in modern Ulcha speech. However, this process appears to have started before the language shift.
Here we use computational Bayesian phylogenetic methods to generate a phylogeny of Tungusic languages and estimate the time-depth of the family. Our analysis is based on a dataset of 254 basic vocabulary items collected for 21 Tungusic doculects. Our results are consistent with two previously proposed basic classifications: variants of the Manchu-Tungusic and the North-South classification. We infer a time-depth between the 8th century BC and the 12th century AD. The application of Bayesian phylogenetic methods to Tungusic languages is unprecedented and provides a reliable quantitative basis for previous estimates based on classical historical linguistic and lexicostatistic approaches.
Bjorkman & Zeijlstra (2018) claim that agreement with the absolutive argument in ergative-absolutive languages follows naturally in an Upwards-Agree (UA) system supplemented by the relation of Accessibility if 𝜙-agreement is parasitic on structural case assigned to the absolutive noun phrase either by T or by v. By drawing evidence from two distantly related East Caucasian languages, Mehweb and Avar, this article shows this view to be erroneous while also fine-tuning the consensus view, due to Legate 2008, of the nature of the absolutive case. I then show that the problematic facts are trivially analyzable with standard Agree (Chomsky 2000 et seq.).