
An Audiovisual Corpus of Caquinte (Arawak)

Language: Caquinte (Arawak)
Depositor: Zachary O’Hagan
Affiliation: University of California, Berkeley
Location: Peru
Collection ID: 0413
Grant ID: IGS0282
Funding Body: ELDP
Collection Status: Collection online
Landing Page Handle:


Summary of the collection

Caquinte is an Arawak language of southeastern Peruvian Amazonia. This collection represents the general documentation of Caquinte in the form of audiovisual recordings of historical, autobiographical, and mythological texts; lexical and grammatical elicitation; interviews; conversations; meetings; etc. Some of these materials were originally written by consultants; others were delivered orally and transcribed. The result as of December 2018 is a segmented, glossed corpus in FieldWorks Language Explorer (FLEx) of approximately 9,750 lines, most of which are translated into Spanish, with additional transcriptions of some texts in ELAN.

The data was collected by Zachary O’Hagan as part of the research for his PhD in linguistics at the University of California, Berkeley, beginning in 2011 and regularly from 2014 onward (that is, starting before the granting period). All fieldwork was carried out in the community of Kitepampani, and the principal consultants were Antonina Salazar Torres, Joy Salazar Torres, Emilia Sergio Salazar, and Miguel Sergio Salazar.


Group represented

Caquinte tradition has it that all Caquintes originally lived at the mouth of the Pogeni River. Perhaps around the middle of the 19th century, they fled to the headwaters of this river due to intense conflicts with Ashaninkas, who continued to raid these upriver settlements well into the 20th century. The deeds of famous warriors such as Taatakini, known for their bravery against Ashaninkas, date from this period. Caquintes seem to have remained isolated in the Pogeni headwaters until the 1950s, when some began to migrate over the hills into the headwaters of the Mipaya and Huitiricaya rivers. Upon the death of a prominent warrior, Shankentini, in about 1959, one extended Caquinte family moved to the Matsigenka community of Puerto Huallana (Picha River), recently formed by the Summer Institute of Linguistics (SIL). SIL undertook an expedition to the upper Pogeni in 1969, and in 1975 to the Ageni River. At the latter location, the family that had moved to Puerto Huallana returned to clear an area for a community that would become Kitepampani, where SIL members resided from the following year. The Caquintes who founded Kitepampani encouraged many of their relatives in the Pogeni basin to move, and many did. Beginning in the early 1980s, the concentration of Caquintes in Kitepampani began to radiate outwards, eventually resulting in the founding of the set of communities where Caquintes live today.

The fieldwork on which this documentation project is based was conducted in Kitepampani, which as of December 2018 has a little over 100 residents. Since 2006 the petrochemical company Repsol has operated in Caquinte territory, resulting in significant changes in material culture in the form of cash, outside goods, cement homes, and a health post. Over a similar period, the municipality of Echarate and, since 2016, that of Megantoni have undertaken the construction of a primary school, a community center, and a system of running water.

Everyone who lives in Kitepampani speaks either Caquinte or the related language Matsigenka as their native and daily language.


Language information

The Caquinte language belongs to the Kampa branch of the Arawak language family. It is spoken in Peru by a few hundred people in some half-dozen communities in the headwaters of the Pogeni River (Junín region), Mipaya River (Cuzco region), and Huitiricaya River (Cuzco region). Depending on the community, the daily language of any given household may be Caquinte or related Ashaninka and/or Matsigenka.

Caquinte is a polysynthetic, head-marking, largely agglutinative language, with remarkably complex verbal morphology. Basic word order is VSO, with preverbal positions available for topics and foci. Nouns can be categorized according to gender and alienability, but, unlike in other Kampa languages, not animacy.


Special characteristics

This collection is the only archival collection of materials related to Caquinte in the world. Of special note is the large FLEx corpus, which makes it easy to search for lexical and grammatical patterns. The focus on mythological texts and interviews about traditional lifeways serves as crucial documentation of traditional Caquinte cultural practices at a time of rapid cultural change.


Collection contents

This collection is focused on audiovisual recordings: approximately 48 hours of .wav files and 26.5 hours of .mp4 files in the genres described in the collection summary above. In addition, there are 5 hours of transcription in the form of .eaf files; an .xml export of a FLEx database of approximately 9,750 segmented and glossed lines (most of them translated into Spanish) and 3,450 headwords; and some field notes.


Collection history

Depositor Zachary O’Hagan first visited Kitepampani for a one-week pilot trip in September 2011, returning for annual 8- to 12-week periods beginning in 2014. Field trips in 2014 and 2015 were funded by an Oswalt Endangered Language grant administered by the Survey of California and Other Indian Languages at the University of California, Berkeley; field trips in 2016, 2017, and 2018 were funded by ELDP. Early documentation focused on processing written versions of traditional stories, at the request of speakers, some of whom liked to record read versions at the end. Later, the number of audio and video recordings increased, as did the range of genres, as described in the preceding summary of the collection. The main focus of data processing has been the segmentation, glossing, and translation of texts in FieldWorks Language Explorer (FLEx). The first collection deposited with ELAR dates from December 2018.


Other information

An equivalent collection, and one that will be developed further with materials beyond the 2018 field season, is available via the Survey of California and Other Indian Languages, here:


Acknowledgement and citation

Users should acknowledge Zachary O’Hagan as the original researcher and depositor of any of the materials contained in this collection. Use of the materials in this collection is strictly for non-commercial purposes only. This collection is part of an active, ongoing research project. The depositor requests that researchers interested in aspects of this collection for linguistic research contact him directly at They are strongly encouraged to consult the digital catalogue of the Survey of California and Other Indian Languages for more up-to-date information regarding this research project and collection: Citation for the latter collection is available at this link.

To refer to any data from the collection, please cite as follows:

O’Hagan, Zachary. 2018. Caquinte Field Materials. Endangered Languages Archive. Handle: Accessed on [insert date here].

© Copyright 2024