A multimedia corpus of siPhuthi
|Affiliation||TU Dortmund University|
|Collection ID||0506, 0651|
|Funding Body||ELDP, Alexander von Humboldt Foundation|
|Collection Status||Collection online|
|Landing Page Handle||http://hdl.handle.net/2196/ebca9f1e-c73c-4d22-8ed8-3abcb2d51ffa|
Summary of the collection
The siPhuthi multimedia digital corpus contains primary language data of different genres recorded in different settings. The collection includes audio-video recordings from speakers of various ages depicting current use of siPhuthi. The collected modern language data will be supplemented by digitised, curated and archived audio recordings from the mid-90s (collected by Dr. Simon Donnelly), the latter allowing for a glimpse into the cultural and linguistic past of a rapidly changing and diminishing language community.
The collection contains contributions from baPhuthi who live in Lesotho and South Africa. Most baPhuthi communities are scattered and live in remote areas in two marginalised and poorly developed districts of Lesotho, namely Quthing and Qacha’s Nek. Some baPhuthi also live in the Mohale’s Hoek district in southern Lesotho, as well as in the northern Eastern Cape province of South Africa. Because of job opportunities or other incentives (e.g. better infrastructure), siPhuthi-speaking individuals and families have migrated to other parts of Lesotho, such as Maseru and Teyateyaneng (T.Y.), but also to South Africa, e.g. Rustenburg (for mine work) and Ceres (for seasonal plantation work).
The collection will also include legacy materials produced by Dr. Simon Donnelly in the mid-90s.
The collection will contain a minimum of 60 hours of recordings: 40hrs of audio-video recordings from speakers of various ages depicting current use of siPhuthi and 20hrs of audio recordings from the mid-90s. More specifically, the collection will comprise:
- 20hrs of time-aligned ELAN transcriptions, translations and annotations of audio-video recordings drawn from narratives (4hrs), interviews (3hrs), natural conversations (6hrs), direct elicitations using a diagnostic tool (6hrs) and songs (1hr), of which 12hrs will come from Daliwe and 8hrs from Sinxondo, Mohale’s Hoek and Qacha’s Nek.
- 20hrs of partially transcribed or untranscribed recordings (with complete metadata).
- 20hrs of legacy materials (recordings, photographs, fieldnotes) produced by Dr. Simon Donnelly in the mid-90s, of which a minimum of 5hrs will be transcribed, translated and annotated. These legacy materials contain folk stories, information on traditional cultural knowledge and elicited grammatical data.
In addition, the collection will contain scanned notebook pages, photographs and a quadrilingual wordlist (siPhuthi, Sesotho, isiXhosa, English). The wordlist is produced in FLEx and consists of lexical items generated from directed elicitations and collected texts.
Acknowledgement and citation
To refer to any data from the collection, please cite as follows:
Shah, Sheena. 2019. A multimedia corpus of siPhuthi. Endangered Languages Archive. Handle: http://hdl.handle.net/2196/00-0000-0000-0010-D126-A. Accessed on [insert date here].