An audio-visual archive and searchable corpus of Kaike, an endangered Tibeto-Burman language of Dolpa, Nepal
|Affiliation||SOAS University of London|
|Collection Status||Collection online|
|Landing Page Handle||http://hdl.handle.net/2196/b57c6669-e7a7-41d1-8f09-7665fc535445|
Summary of the deposit
Kaike (ISO 639-3 kzq, 82E; 28N, ca. 800-1000 speakers) is an endangered Tibeto-Burman language spoken in Dolpa, Nepal. All speakers of Kaike are also fluent in Nepali and Poinke Tibetan, and Kaike is not used in writing, religious contexts or education. This project presents high-quality audio-visual materials to preserve the language for the local and linguistic community. These materials will furthermore be transcribed, translated and donated to the local community as a collection of traditional stories and practices that can be used as teaching material. Finally, the texts will be morpho-syntactically annotated to provide a searchable corpus for anyone interested in the rich typology of the languages of the Himalaya or in Kaike’s interesting linguistic features.
Kaike is a Tibeto-Burman language spoken mainly in four villages in Dolpa district, Mid-western Nepal. This is a remote region not accessible by road, and the closest airport (Juphal) is a two-day walk away. Kaike is often classified as a Tamangic language, but its exact genetic classification remains unclear (cf. Honda 2008). According to the CBS report of 2001, Kaike was spoken by less than 40% of the disadvantaged indigenous Kaike population (i.e. fewer than 800 of a total population of 2000 people). The National Population and Housing Census in Nepal reports that only 50 speakers remained in 2011. However, Ambika Regmi notes in her short grammar of Magar Kaike, based on a field study in 2011, that there must be at least 1000 speakers (Regmi 2013a:1n1). The language is labelled as ‘severely endangered’ (Yadava 2004) and ‘definitely endangered’ (UNESCO, Moseley 2010), because Kaike is not written, is not used in education, and speakers are leaving the rural area to make a living in towns like Nepalgunj or Kathmandu. A standardised orthography, primers and teaching materials in the Kaike language are called for (Regmi 2013b:168).
Speakers of Kaike are all trilingual (Kaike, Nepali and Poinke/Tichurong Tibetan, Regmi 2013b:168). Although Kaike is still spoken at home, Nepali in particular is increasingly used for various activities, including village meetings and telling stories to children. Nepali is used exclusively in education, and it is also used in songs and increasingly in other activities such as bargaining. Culturally, the Kaike people are unique because they were originally Hindu, but changed to Lamaist Buddhism when it became increasingly difficult to get Hindu Brahmins from the faraway village of Tibrikot to perform rites and rituals in the Kaike villages. Buddhist Lamas from neighbouring Tibetan communities were easily accessible, resulting in an interesting mix of cultural and traditional components in Kaike culture: indigenous Kaike rites of passage combined with the observance of both Hindu and Buddhist festivals.
Kaike shows a number of interesting typological features in various parts of the grammar that are not commonly found in related languages: it has 6 phonologically distinctive tones, conjunct/disjunct morphology, two types of converbal constructions (simultaneous and sequential), morphological marking of the causative and mirativity, widespread nominalisation for syntactic constructions, consistent ergativity, reciprocal and anti-dative marking, evidentiality, and a complex numeral system that combines decimal and vigesimal counting.
Although the project aimed to collect audio-visual materials in Kaike only, in some villages speakers turned out to mix Kaike and Tichurong Tibetan constantly. This resulted in an interesting collection of code-switched materials, mainly from traditional activities in Gumbatara. Code-switched videos therefore have both Kaike and Tichurong Tibetan in the metadata.
This project has three main outcomes: high-quality audio-visual materials, a transcribed and translated collection of texts and, soon, a searchable annotated corpus of all the collected data.
This is a collection of audio-visual recordings of narrative tales, descriptions of traditional local activities, and conversational and task-based data from speakers of the four villages in which Kaike is spoken, as well as from speakers who have now moved to Kathmandu. The data comes from a range of speakers, men and women of different ages and socio-linguistic circumstances, to ensure a rich data set of Kaike.
In total, the audio-visual output will be at least 6 hours, organised under 4 different topics:
- stories (from different villages and speakers in different versions) – 2 hours
- traditional task/activity descriptions (from different villages) – 1.5 hours
- conversations (with familiar and unfamiliar speakers) – 1.5 hours
- task-based recordings (monologue and dyadic) – 1 hour
The above-mentioned 6 hours are furthermore transcribed and translated and made available through files in ELAN format. Please note that the current ELAN files still need to be checked and updated; in Spring 2019 they will be replaced by updated versions. A further 6 hours of materials (both audio and video), mainly from traditional ceremonies, landscape and village life in Lower Dolpa are added to the collection as well.
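For users who want to work with the transcriptions outside the ELAN application, the following is a minimal sketch of how time-aligned annotations might be read from an ELAN (.eaf) file, which is an XML document. The sample document, the tier name `transcription` and the helper `read_tier` are hypothetical and used for illustration only; the actual tier names and structure of the files in this deposit may differ and should be checked in the files themselves.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature EAF document: the tier name and text are invented
# for illustration and are not taken from the Kaike deposit.
SAMPLE_EAF = """
<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcription">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>example utterance</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_tier(root, tier_id):
    """Return (start_ms, end_ms, text) tuples for one time-aligned tier."""
    # Map each time slot ID to its value in milliseconds.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start, end, text))
    return rows


root = ET.fromstring(SAMPLE_EAF)
rows = read_tier(root, "transcription")
```

To read a downloaded file instead of the inline sample, `ET.parse(path).getroot()` can replace `ET.fromstring(SAMPLE_EAF)`. Tiers that depend on another tier (for example a translation tier) are often stored as reference annotations rather than time-aligned ones, which this sketch does not handle.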
The data for this deposit was collected by Marieke Meelen with a Small Grant from ELDP in two fieldwork trips: the first in Winter 2017/2018 with Kaike speakers in Kathmandu and the second in Summer 2018 in the Kaike villages in Lower Dolpo, Nepal.
In early 2019, Marieke went back to Nepal to share materials with the Kaike community and to continue her work on developing teaching materials and a Kaike story book.
None of this data may be used as evidence in court.
Acknowledgement and citation
The collection of the Kaike materials has benefited greatly from the support of the Kaike community and in particular Jag Bahadur Budha Magar.
If you use any part of this collection, please acknowledge Marieke Meelen as the principal investigator and cite the corpus, any audio, video or transcription file in the following way:
Meelen, Marieke. 2018. An audio-visual archive and searchable corpus of Kaike, an endangered Tibeto-Burman language of Dolpa, Nepal. Endangered Language Archive. Handle: http://hdl.handle.net/2196/00-0000-0000-0010-8821-3. Accessed on [insert date here].
Please contact the depositor if you have any questions about this deposit or the Kaike language.