
Documentation of the Baja California Yuman Languages Kumeyaay and Ko’alh


Language Kumeyaay (ISO639-3:), Ko’alh (ISO639-3:)
Depositor Margaret Field, Amy Miller
Affiliation San Diego State University
Location Mexico
Collection ID 0357
Grant ID MDP0291
Funding Body ELDP
Collection Status Collection online
Landing Page Handle


Summary of the collection

This collection is the product of a language documentation project for two Baja California Yuman languages, Kumeyaay and Ko’alh, placing emphasis on discourse. The research team worked with native speakers to transcribe, translate, and analyze over 40 texts that we had previously collected, and used the results to expand our Ko’alh-Kumeyaay comparative dictionary database and write a grammatical sketch of Ko’alh.


Group represented

The materials in this collection represent speakers of Ko’alh and Baja California varieties of Kumeyaay. Both varieties are spoken in Baja California Norte, Mexico, and both are highly endangered. Only about 60 people speak Kumeyaay fluently in all of Baja California and fewer than ten fluent speakers remain in the U.S. Only four people speak Ko’alh fluently. All are at least 50 years old and most are elderly. Children learn words in their ancestral language (they can exchange greetings and respond to simple commands) but both children and adults use Spanish for most of their daily communication.

The Kumeyaay languages are unusual in the extent to which, even within a tightly knit community, the speech of one person may differ from that of his neighbors or relatives: siblings may have slightly different phoneme inventories, slight differences in rules of inflectional morphology, more noticeable differences in derived nominal or plural forms, and even glaring differences in some basic vocabulary items (see Miller 1991b, 2001, in prep). Variation across communities is much more pronounced, with the result that speakers of one variety have trouble translating tape recordings made by speakers who grew up only thirty miles away. In face-to-face situations, such difficulties are more readily resolved.


Language information

The materials in this collection document the Ko’alh and Baja California varieties of Kumeyaay. Both varieties are spoken in Baja California Norte, Mexico, both belong to the Delta-California branch of the Yuman family, and both are highly endangered; fewer than sixty people still speak Baja varieties of Kumeyaay, and exactly four people speak Ko’alh.

The Yuman languages are typologically interesting for a number of reasons. Foremost among these is lexical structure. The typical Yuman word features a root of shape (C)V(C) surrounded by any number of prefixes and suffixes. The root and the root boundary are important within the structure of the language as reference points for derivational, inflectional, and even phonological rules. Comparatively and historically, the root is the part of the word most likely to be shared across related languages and it is not at all uncommon for cognates to share only the root while differing in affixes. The prefix structure of the root allows the insertion of derivational and even inflectional prefixes. Other typologically interesting features of Yuman languages are the extent to which they make use of sound symbolism and metathesis, the fact that plural formation is (historically in all languages and synchronically in most) a derivational process, their propensity for clause chaining, their use of auxiliaries for a variety of purposes including the expression of progressive aspect, and the fact that relative clauses are head-internal.

Kumeyaay languages extend from San Diego County southwards about 150 miles into Baja California, Mexico. On the U.S. side of the border, there are 12 federally recognized Kumeyaay tribes. U.S. varieties are relatively well documented (see for instance Langdon 1970, Couro and Hutcheson 1973, Miller 2001, Miller and Langdon 2008), and linguistic variation across the U.S. varieties of Kumeyaay is widely acknowledged, although researchers disagree as to whether different languages should be recognized within Kumeyaay territory (see Langdon 1991, Miller 2001: 359-363, Miller 2008a, Field 2012, Miller in prep).

South of the border, varieties of Kumeyaay are still spoken in six communities: La Huerta, Juntas de Neji, San Jose de Tecate, Peña Blanca, San Jose de la Zorra and San Antonio Necua. Neighboring languages are all Yuman: Cocopa to the east, Paipai, Ko’alh, and Kiliwa to the south. (The Pacific Ocean lies to the west and U.S. Kumeyaay to the north.)

No systematic documentation had been done until we began field work in 2008, and published materials are severely limited, comprising several old vocabularies, 16 sentences cited by Langdon (1976), two very brief La Huerta texts (Hinton 1976, 1978, 1984), a paper on inflection in La Huerta (Hinton and Langdon 1976), a 20-item comparative vocabulary list (Field 2012), and a paper on lexical structure and derivational morphology in Nejí (Miller in press). Our fieldwork has revealed that the six Baja California Kumeyaay communities are rich in linguistic resources and diversity and that speakers remember an oral literature which their relatives north of the border have lost. Full documentation of these speech varieties and their oral literature is needed, and because speakers are few and elderly, this work is urgent.

Ko’alh, not yet widely recognized as a distinct Yuman language, is closely related to but mutually unintelligible with Kumeyaay. Ko’alh is spoken in Santa Catarina, Baja California, about 100 miles from the U.S. border. It is a minority language even in Santa Catarina, where most people speak the distantly related Yuman language Paipai. Neighboring languages are Yuman: Kumeyaay to the north, Cocopa to the northeast, and Kiliwa to the south.

For the sake of completeness we note that a speech variety called K??a?? has been partially documented, in the form of a 594-sentence questionnaire and phonological sketch by Mixco (n.d.). Mixco regards K??a?? as a variety of Kumeyaay, and a study of his manuscript shows that he is quite right to do so. However, the K??a?? of Mixco’s manuscript is by no means the same language as the Ko’alh that we propose to document; the two differ in phonology, morphology, syntax, and lexicon, as detailed in the following paragraph. Speakers of Ko’alh state emphatically that they are not Kumeyaay, and Kumeyaay consultants are unable to translate Ko’alh data and explain that Ko’alh is “not my language.”

Differences between Ko’alh and Kumeyaay (the latter including Mixco’s K??a??) pervade all levels of the grammar. The following are described by Miller (2010 ms): (1) In the phonology, Ko’alh has three contrasting sibilant phonemes: /s/, /?/, and /š/, while Kumeyaay languages have /s/ and (depending on the language) either /?/ or /š/. While Kumeyaay languages have four contrasting laterals /l/, /l?/, /?/, and /??/, palatalized and non-palatalized laterals have been neutralized in Ko’alh, with only the non-palatalized versions remaining. This neutralization has pulled the conditioning environment out from under what had been mid vowel allophones of low vowels, allowing mid vowels to be re-analyzed as phonemes, and the resulting vowel system has five vowel qualities (plus a length contrast). The vowel system of Ko’alh thus now resembles that of Paipai (see Joel 1966). (2) In addition to its divergent phonemic system, Ko’alh has a lexicon quite distinct from that of Kumeyaay, with many items borrowed from Paipai or Kiliwa. (3) Aspects of Ko’alh syntax and morphosyntax (including switch reference, auxiliary constructions, and an entire range of modal suffixes and particles) appear to differ considerably from their Kumeyaay counterparts. Further research will shed more light on these areas.


Collection contents

When completed, the collection will include

  • a body of analyzed texts of discourse data (20 hours, 42 narratives) including ethnographic and ethnobotanical information as well as traditional literature, with time-aligned annotations of practical orthography, morpheme-by-morpheme analysis, morpheme-by-morpheme gloss, translation into Spanish and English
  • a comparative Kumeyaay-Ko’alh database with lexical data and example sentences from the text corpus, built in Toolbox, with distinct Toolbox databases for each variety analysed, with each lexical entry including a main entry in phonemic form (in practical orthography), variant forms, notes on variation, keyword glosses in English and Spanish, full glosses in English and Spanish, example sentences in practical orthography with translation in English and Spanish, notes on phonetics where applicable, morphophonemic forms in the standardized Yuman code developed by Margaret Langdon, intermediate reconstructions, speech variety, and information on contributor, collector and recording
  • a grammatical sketch of Ko’alh (phonology, lexical and derivational morphology, inflection, syntax, discourse) with illustrations primarily from examples gathered from the texts
  • metadata including speakers’ birthdates and places, dialects and languages spoken, parents’ names, and speech event description
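
The lexical entry described in the list above can be pictured as a simple record. The following Python sketch is illustrative only: the class names, field names, and helper function are hypothetical and are not the project's actual Toolbox field markers, and the placeholder values are not real language data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ExampleSentence:
    text: str      # sentence in practical orthography
    english: str   # English translation
    spanish: str   # Spanish translation


@dataclass
class LexicalEntry:
    # Field names are hypothetical; they mirror the items listed above.
    headword: str            # main entry, phonemic form in practical orthography
    speech_variety: str
    gloss_keyword_en: str
    gloss_keyword_es: str
    gloss_full_en: str
    gloss_full_es: str
    variants: List[str] = field(default_factory=list)
    variation_notes: str = ""
    phonetic_notes: str = ""
    morphophonemic_form: str = ""   # form in the standardized Yuman code
    reconstructions: List[str] = field(default_factory=list)
    examples: List[ExampleSentence] = field(default_factory=list)
    contributor: str = ""
    collector: str = ""
    recording: str = ""


def missing_fields(entry: LexicalEntry) -> List[str]:
    """Return the names of fields that are still empty in an entry."""
    return [name for name, value in vars(entry).items() if value in ("", [])]


# Placeholder values only, not real lexical data.
entry = LexicalEntry(
    headword="HEADWORD",
    speech_variety="VARIETY",
    gloss_keyword_en="keyword",
    gloss_keyword_es="palabra clave",
    gloss_full_en="full gloss",
    gloss_full_es="glosa completa",
)
print(missing_fields(entry))
```

A record of this kind makes it easy to see which parts of an entry (for instance example sentences or reconstructions) still need to be filled in as text analysis proceeds.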

In addition, an orthography guide for written Kumeyaay and Ko’alh is available as part of the collection.


Special characteristics

The materials in this collection are invaluable for a better understanding of Yuman prehistory. Since the locus of the greatest diversity in Yuman languages is to be found in Baja California (where the most divergent Yuman languages are located), it is possible that Baja California predates the Colorado River area as a homeland. A better understanding of Ko’alh and Mexican varieties of Kumeyaay will go far toward answering questions about the role of Baja California in Yuman prehistory.


Collection history

We have been engaged in research on Ko’alh and Baja California varieties of Kumeyaay since 2008. Margaret Field was awarded a National Science Foundation grant for field work on Baja California Kumeyaay (“Language Revitalization and Documentation of Kumeyaay Spoken in Baja CA”) in 2008. This grant was funded at a fraction of the requested level, and we agreed to focus on data collection and revitalization efforts while deferring much of the transcription and analysis until the future.

With the help of SDSU graduate students, we collected approximately 130 hours of digital audio and high-definition video from five Baja Kumeyaay communities: San Jose de la Zorra (15 hours), La Huerta (25 hours), Neji (65 hours), Tecate (25 hours), and San Antonio Necua (30 minutes; two stories). In Santa Catarina we similarly collected approximately 100 hours of Ko’alh data. Our recordings include approximately 20 hours of naturally occurring discourse: 42 texts including traditional stories, personal and local history, and task-based and hortatory discourse connected to traditional activities such as basket making, pottery, food collection and preparation, childbirth, and traditional uses of plants.

Faced with five divergent Kumeyaay dialects and a wholly distinct language, Ko’alh, Amy Miller managed to analyze the phonemic systems of four of the six communities and to transcribe and enter into the Toolbox database about half of the Ko’alh lexical data and approximately 25% of the Kumeyaay lexical data. Time constraints allowed work on only three texts, with the result that 39 of the 42 texts already gathered remained to be transcribed, translated, and analyzed. The research on these texts formed part of the work conducted for the project which led to the collection archived here.

From 2008 to 2012, with funding from the NSF, Field hosted regular workshops, developing a practical orthography for Baja Kumeyaay as well as teaching immersion methodology and providing instruction and practice in using the orthography. These workshops were successful, with the result that our consultants were eager to read sample dictionary pages and regularly provided helpful feedback. One consultant kept a notebook, written in Kumeyaay, in which she wrote stories and recorded items of interest that arose during fieldwork. Field also trained two community members and gave them digital recorders so that they could record elder speakers in their own communities (Neji and La Huerta), and they made important recordings for our collection.

After NSF funding was exhausted in 2012, we continued fieldwork at our own expense, supplemented by a small grant from the Sycuan Institute on Tribal Gaming and Field’s stipend from the Fulbright-Garcia Robles program. As a Fulbright-Garcia Robles Border Scholar, Margaret Field collaborates with linguists from Mexico’s Instituto Nacional de Antropología e Historia, which has agreed to publish pedagogical materials for the community, including traditional stories in book form and multimedia language teaching lessons.

New data collection as well as research on the existing texts were conducted between 2014 and 2016 as part of the research for the ELDP-funded Major Documentation Project awarded to Margaret Field. As part of this project, we focused on translating and analyzing as many as possible of the 39 remaining narratives (approximately 20 hours of discourse) and elicited lexical and grammatical information to help us in text analysis and to flesh out the Ko’alh grammatical sketch.


Other information

In addition to archiving at ELAR, we have collaborated with the Archive of Indigenous Languages of Latin America (AILLA), University of Texas, Austin, to archive all of our digital discourse data (both audio and video). The data will eventually all be available (much of it already is) to anyone wishing to view it, including academics and tribal communities.


Acknowledgement and citation

To refer to any data from the collection, please cite as follows:

Field, Margaret. 2018. Documentation of the Baja California Yuman Languages Kumeyaay and Ko’alh. Endangered Languages Archive. Handle: Accessed on [insert date here].
