
Hadza: an archive of language and cultural material from the Hadzabe people of Eyasi (Arusha, Manyara, Singida, and Simiyu regions, Tanzania)




Language Hadza (ISO639-3:hts)
Depositor Andrew Harvey, Richard Griscom
Affiliation Leiden University
Location Tanzania
Collection ID 0597, 0618
Grant ID IPF0285, IPF0304
Funding Body ELDP
Collection Status Collection online
Landing Page Handle






Summary of the collection

Hadza [hts] is an endangered language isolate spoken by a community of approximately 1,000-2,000 people in northern Tanzania. This deposit contains audio-visual material collected by members of the Hadza speech community and researchers, and it constitutes the first and only open access documentary record of the Hadza language.

Community participation was central to the creation of this collection. The majority of data in the deposit was collected by community members themselves, following their own interests and goals for the documentation of their language and traditional cultural practices. The resulting documentary record is diverse both in terms of content and participants involved and reflects a wide spectrum of individuals, places, and perspectives.


Group represented

The Hadzabe people and their ancestors are believed to have lived in the same region of Tanzania for multiple millennia, but within the first two decades of the 21st century, their traditional way of life has become severely threatened by wildlife and habitat loss, exploitative tourism, and the encroachment of other ethnic groups into their traditional territory. As the number of Hadzabe who have permanently relocated to urbanized areas continues to increase each year, language endangerment and loss of traditional ecological knowledge are unfortunate realities. More than two-thirds of speakers have adopted sedentary non-foraging lifestyles in multi-ethnic villages bordering the wilderness, resulting in a rich diversity of lifeways and experiences.

The Hadza language is believed to be unrelated to any other language, and is thus of significance to the study of the boundaries and possibilities of human cognition and the language faculty. Hadza is also considered by many to reflect the linguistic history of East Africa prior to the arrival of other ethnic groups and provides a unique opportunity to learn about language use in the context of a nomadic foraging ecology that resembles that of early humans.


Language information

Hadza is a language isolate spoken by up to 1,000 people in northern Tanzania (Mous 2003, Miller 2017, Edenmyr 2004, Blurton Jones 2016). Many of the Hadza are still practicing hunter-gatherers, and as such they live in very remote areas. Hadza has a number of notable linguistic features that make it especially significant for the scientific study of human language. The consonantal inventory is extensive, including three series of clicks, plosives and ejectives, and central and lateral affricates (Sands et al. 1996). All numerals over ‘two’ are borrowed (Miller 2008), and animals have one name used when they are spotted during hunting and a different name used when announcing a kill (Miller 2008), a pattern similar to that of Aasáx, another language of Tanzania (Petrollino and Mous 2010).


Special characteristics

This deposit is the first open access documentary record of the Hadza language and includes the first recordings produced by members of the Hadza community. The first release of the deposit makes legacy materials from Bonny Sands’ fieldwork in the 1990s openly available for the first time, provides the first openly accessible time-aligned transcriptions and translations of Hadza recordings, and includes the first mobile recordings of traditional foraging activities using action cameras.


Collection contents

The contents of the collection consist of bundles of audio-visual recordings, transcriptions and translations, parsing and glossing, and fieldnotes. Each bundle groups together all of the materials associated with a particular recorded speech act, and, in the case of fieldwork notes, each digitized notebook has its own bundle.

Users can search for items in the deposit by entering text into the search bar on the top left. This searches across the titles, descriptions, and other metadata for each bundle in the collection, such as genre, topic, keywords, participants, and places.

Users can also explore subsets of the collection by selecting metadata values on the left, such as Type, Genre, Topic, or Participants. When viewing a list of bundles or an individual bundle, there are keywords visible for each bundle which can also be selected and used to explore bundles with similar content. The current Types, Genres, and Topics of the deposit are listed here:

  • Types
    • Audio (.wav)
    • Video (.mp4)
    • ELAN (.eaf)
    • FLEx (.flextext)
    • Praat (.TextGrid)
    • Document (.pdf, .csv)
  • Genres
    • Conversation – A dialogue between two or more speakers.
    • History Lived – A historical account of events that the speaker experienced.
    • History Received – A historical account of events that the speaker did not experience.
    • Fictional Story – A non-historical story.
    • Ritual Text – Speech associated with traditional
    • Singing – Musical performances with and without instrumentation.
    • Demonstration – A demonstration of a traditional practice.
    • Tongue Twisters – A sequence of words or phrases that are difficult to produce quickly.
    • Elicitation – Linguistic elicitation of targeted speech data for research purposes.


Collection history

There are two primary sub-collections within the data, produced by the following contributors:

1. Hadza community members, Andrew Harvey, and Richard Griscom (2019-2021)

Audio-visual recordings, rough transcriptions, and Swahili translations of a wide range of speech acts, genres, speakers, etc., produced by the following teams of Hadza community members:

  • Domanga – Angela Jackson, Nange Chaka
  • Mongo wa Mono – Endeko Simon, Beatrice Simon
  • Kipamba – Elizabeth Minja, Naftali Mosses, Jakobo Lubumba
  • Mang’ola – Mariamu Anyawire, Bunga Paulo

Audio recordings, orthographic transcriptions, and translations to Swahili and English of linguistic elicitation sessions produced by the researchers Andrew Harvey and Richard Griscom. Audio-visual recordings of traditional foraging activities produced by Andrew Harvey and Richard Griscom.

2. Bonny Sands (1992, 1997)

Audio recordings of elicited materials, narratives, conversations, songs, and tongue twisters, produced by Bonny Sands. Digitized fieldnotes produced by Bonny Sands.

Release history

February 2021 – The first release of the deposit was created, including new materials produced by community members, Andrew Harvey, and Richard Griscom, as well as legacy materials from Bonny Sands.


Acknowledgement and citation

Any publication using the contents of the collection should include the following citation:

Griscom, Richard & Andrew Harvey. 2021. Hadza: an archive of language and cultural material from the Hadzabe people of Eyasi (Arusha, Manyara, Singida, and Simiyu regions, Tanzania). Endangered Languages Archive. Handle: Accessed on [insert date here].

In addition, any use of a specific bundle within the collection should include the unique bundle citation, as described in the bundle description.

We wish to thank the Hadza community for their contributions to this deposit and would like to specifically acknowledge the tireless efforts of the Hadza local researchers Angela Jackson, Nange Chaka, Elizabeth Minja, Jakobo Lubumba, Endeko Simon, Mariamu Anyawire, and Bunga Paulo. Thanks to Daudi Peterson for guidance during the planning and implementation of the project, as well as the researchers Bonny Sands, Kirk Miller, and Jeremy Coburn for their invaluable feedback. The creation of this deposit was funded by the Endangered Languages Documentation Programme (IPF 0285, IPF 0304), and related research activities were approved by the Tanzanian Commission of Science and Technology (COSTECH).
