An article by members of the Linguistic Convergence Laboratory was published in the journal “Language Variation and Change”
From a northern village to an academic article, or How many linguists do you need to describe variation in a Russian dialect?
The article is devoted to variation in the Ustja (River Basin) dialect of Russian. It holds special importance for us: it represents many years’ worth of work by a large group of people. In 2013 N. R. Dobrushina and M. A. Daniel took a group of students from the National Research University Higher School of Economics (NRU HSE) on a dialectological field trip to the Archangelskiy region. The Slavicist Ruprecht von Waldenfels, who was then teaching at the University of Bern, initiated the trip. He dreamt of seeing a Russian village and of taking his Swiss students with him.
As is expected of dialectologists, we began by recording our conversations with locals on a dictaphone and analyzing them. Then, during the expedition, Ruprecht created a small corpus with a search function. This corpus became the first in a series of spoken corpora, and the technology and methodology used in its creation have been developed continually since. The Ustja (River Basin) corpus has become the largest spoken Russian dialect corpus, and we have applied the experience we gained from creating it to other Russian dialects.
A. V. Ter-Avanesova, a dialectologist from the Russian Language Institute of the Russian Academy of Sciences (RLI RAS), began to accompany us on our expeditions, and the composition of the student team also changed a bit. Then we received a grant from NRU HSE to form an academic research group, and colleagues from other institutions began joining us as well (for example, our coauthors Sergey Say and Maria Ovsyannikova from the Institute for Language Studies of the Russian Academy of Sciences (ILS RAS)).
We decided almost immediately that we would pursue not classic dialectology but quantitative studies of variation. As in many other villages, in Pushkino (also known as Mikhalyovskaya) the dialectal features of the language are gradually disappearing, and speech is becoming more and more similar to the literary standard. The difference between generations is huge: elderly residents speak very differently from the young.
For example, people of the older generation often say E in place of a stressed A between soft consonants: opjet’ instead of opjat’, ostavljejut instead of ostavljajut. The stressed syllable which once contained yat is pronounced I between soft consonants by older speakers: jizdit’ instead of jezdit’, sjijali instead of sjejali. They say u ego instead of u nego and one instead of oni. Moreover, the speech of the older generation contains many particles whose form depends on the preceding word: Von iz chaynika-ta ljeni-ko (usually: naley) kipjatochku-tu ‘Pour some hot water from the teapot there’.
We decided to look at how some dialectal features changed from the older to the younger generations of speakers: in which generation the dialectal variant began to be replaced by the literary standard, where it is lost completely, whether the various features behave similarly, whether some survive longer, and more.
Such research is based on massive work with the recorded material. First, texts must be recorded, transcribed and analyzed, to be sure that there are enough examples of any given feature from every generation. The first glossing is done by students, whose work is then checked and reviewed, as they are often unacquainted with some words and concepts. One of our favorite stories is how the locals rode on sheep (Rus. baran), when they actually rode on snowmobiles (Rus. buran). Our Swiss colleagues were once found hotly discussing what a shaika ‘band, group, gang; (hockey) puck’ was: how did a group of bandits appear in the bathhouse?
Then it is necessary to determine in which dialects a given feature is present. For example, in the village of Pushkino several variants of the Russian word for ‘pike’ are attested, including shchuka (as in the literary standard), shshuka (with a hard first consonant), sh-chuka, and other intermediate variants. It is necessary to listen to all of these and decide how to code each pronunciation. Without the help of A. V. Ter-Avanesova, an incredible phonetician and dialectologist, this work would have been very difficult. She taught the students how to listen to examples repeatedly and how to code them so that it would later be possible to use quantitative methods of analysis.
It is also important to calculate and draw: to be able to apply statistical models and come up with visualizations for the data. At this point in our work, we were joined by the mathematician Ilya Shchurov. Together with him and Polina Kazakova, who was then still a student in our bachelor’s program, we figured out how to calculate the speed of attrition of various dialectal features. We came up with three parameters by which the attrition rates could be compared so that the comparisons gave statistically reliable results. The first parameter is the initial level: the extent to which the feature was preserved by the oldest speakers. The second is steepness: how rapidly change occurred. Are there many speakers who use both variants of the feature, or has change occurred so quickly that the older speakers use the dialectal variant, the younger use the standard, and there is no transitional group between them? Finally, the third parameter is the turning point (we had a long discussion over what to name this parameter). The turning point is the moment, from one generation to the next, when locals went from using predominantly the dialectal variant to using predominantly the literary standard.
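To make the three parameters concrete, here is a minimal sketch that assumes the share of the dialectal variant follows a logistic curve over speakers' year of birth. The curve shape, the function name `dialectal_share`, and all parameter values below are illustrative assumptions, not the model or the data from the article. In this sketch the turning point is the midpoint of the curve, which coincides with a 50% share only when the initial level is 1.

```python
import math


def dialectal_share(birth_year, initial_level, steepness, turning_point):
    """Expected share of the dialectal variant for a speaker born in birth_year.

    initial_level: share among the oldest speakers (upper plateau, between 0 and 1)
    steepness:     how abruptly the share drops per year of birth (positive)
    turning_point: birth year at which the share has fallen to half of initial_level
    """
    return initial_level / (1.0 + math.exp(steepness * (birth_year - turning_point)))


# Hypothetical parameter values, invented for illustration only.
FEATURES = {
    "gradual feature": dict(initial_level=0.9, steepness=0.05, turning_point=1960),
    "abrupt feature": dict(initial_level=0.9, steepness=0.5, turning_point=1960),
}

for name, params in FEATURES.items():
    # Compare speakers born well before, at, and well after the turning point.
    shares = [round(dialectal_share(year, **params), 2) for year in (1930, 1960, 1990)]
    print(name, shares)
```

Both hypothetical features share the same initial level and turning point, so any difference in the printed values comes from steepness alone: with a low steepness there is a broad transitional group of speakers who use both variants, while with a high steepness the switch happens almost from one generation to the next.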
We then attempted to understand what the differences between the various features told us; for example, why some dialectal features disappeared quickly while others were still in use by a large population of younger speakers. While we cannot say that we have a firm understanding of the causes, we certainly tried. We propose that this is connected with the extent to which a feature is recognized or considered dialectal by the speakers themselves. This may be determined by a feature’s linguistic conspicuousness and, possibly, by the level of language structure to which it belongs (phonetic or morphological).
The result is an article with fifteen authors and data on eleven features. Behind each feature stands a researcher (sometimes a very young one: among the authors are eight students).
This article is an example of a major collective project which united the efforts of many people. If we are to mention those who participated in the creation of the corpora (who recorded, analyzed, and listened to the texts, published the corpora online and perfected them), then we can count dozens more people who contributed to the article. In the humanities such collaboration is not typical; even the journal’s editors tried to dissuade us from including fifteen authors. It is happening more and more often, however, and researchers and students in the Linguistic Convergence Laboratory and in the School of Linguistics are undertaking many such collaborative, cross-disciplinary projects, which will yield articles of similar scale in the future.