Collaborative corpus building for Sonsorolese

Members of the Young Historians of Sonsorol with their advisors cleaning up after the Young Historians of Sonsorol Cultural Day in 2023. Photo by Imee Pedro, 2023.

Language Sonsorolese
Depositor Vasiliki Vita, Daphne Nestor
Affiliation SOAS University of London, Young Historians of Sonsorol
Location Palau
Collection ID 0667
Grant ID SG0687
Funding Body ELDP
Collection Status Collection online
Landing page handle


Summary of the collection

Sonsorolese (ISO 639-3: sov) describes the languages of the islands of the State of Sonsorol, in the Southwest region of the Republic of Palau, a Micronesian nation in the western Pacific. According to Eberhard et al. (2021), Sonsorolese is endangered and is currently spoken by fewer than 400 speakers. A team composed of a linguist and the Young Historians of Sonsorol worked together on building a language corpus for these lesser-documented languages, including Pulo Anna. Discussion groups, along with audio and video recordings of those sessions and of cultural practices in their social context, aim to shed light on meaning meta-discourse and create a corpus for Sonsorolese.


Group represented

The group represented in this deposit consists of individuals who identify as speakers of Sonsorolese in the Republic of Palau. According to Ierago, the first settlers of the island of Sonsorol (approximately 600 kilometres from the main island chain of Palau) arrived at least 500 years ago from the Northeast and the islands of Yap. Most legends agree that the first settlers on Sonsorol Island were from Mogmog. Sonsorolese is part of the Trukic (Chuukic) dialect continuum and is related to Woleaian and Ulithian. In recent years, for environmental and economic reasons, many inhabitants have moved to the port city of Koror, leading to the emergence of a vibrant community in the Echang hamlet (Tibbetts 2019). This documentation project was carried out predominantly in Echang. Volunteer Young Historians of Sonsorol residing on Dongosaro (or Sonsorol) island created the dictionary database and also produced some of the recordings.


Language information

Sonsorolese (ISO 639-3: sov) describes the languages of the islands of the State of Sonsorol: Ramari Dongosaro, spoken on the island of Sonsorol or Dongosaro; Ramari Puro, spoken on the island of Pulo Anna or Puro; and Ramari Melieli, spoken on the island of Merir or Melieli. The languages are spoken alongside one another on the islands of Sonsorol and Pulo Anna, while there are currently no inhabitants on the island of Merir. They are also spoken in Echang, Koror, one of the main islands of Palau, as well as in the various diaspora destinations of their speakers, including Guam, Saipan, Japan, Taiwan and the USA.


Special characteristics

This deposit is the result of a collaborative effort between a linguist, Vasiliki Vita, and native speaker researchers: Chelsea Pedro, Lincy Lee Marino, Daphne Nestor and the Young Historians of Sonsorol. The topics for documentation were selected by the whole team, volunteers and speakers of Sonsorolese. What makes this collection special is that the recordings document not only language use but also the collaborative effort between all stakeholders involved. They also contain recordings of vocabulary elicitation sessions conducted by native speakers with native speakers using an inductive approach, where the elicitation session is organised in the form of a lesson prepared by the native speaker (language teacher). Other interesting recordings feature a speaker of Ramari Puro, the language spoken on Pulo Anna island, which is said to be slowly disappearing.


Collection contents

In its current version, the corpus contains approximately 30 hours of recorded language use. This includes recordings of cultural practices, the history and political structure of the Sonsorol islands, vocabulary elicitation and discussion sessions, Sonsorol State Legislature sessions and municipality council meetings in both Ramari Dongosaro and Ramari Puro. Additionally, it includes a PDF of approximately 3000 lexical entries extracted from a WeSay database. There are also 118 pictures documenting the project's progress, various events organized by the Young Historians to showcase their work and share it with the kids in the community, and Vasiliki’s trip to the Sonsorol islands in December 2022. The naturalistic language use documented includes recordings of municipality council meetings, memories from the past, history, and evaluations of different recordings and knowledge acquired.


Collection history

The project from which this deposit originated was financed by an ELDP Small Grant (SG0687), awarded to the team for the period from October 2022 to September 2023. It is part of a wider collaborative project between Vasiliki Vita and the Young Historians of Sonsorol, aimed at prestige planning for Sonsorolese in the State of Sonsorol and funded by the Onassis Foundation and Julia Sallabank’s British Academy Small Grant SRG20\200966, ‘Language Revitalisation: From Practice to Theory’. Data collection for this project began in October 2022, when Vasiliki Vita arrived in Echang, Koror. During the project months, the team started by organising the themes for the documentation and by training in the technical and linguistic analysis aspects of the project, as well as in lesson planning. The first batch of data was deposited with ELAR in June 2023, and another batch was deposited in February 2024.


Acknowledgement and citation

Users of any part of this collection should acknowledge Vasiliki Vita as the principal investigator, and Chelsea Pedro, Lincy Lee Marino, Daphne Nestor and the Young Historians of Sonsorol as core members of the research team. The Endangered Languages Documentation Programme is the funder of this project. Uses of parts of the corpus should acknowledge by name the people who recorded the given session, and the individuals appearing in the recordings whose words and/or images are used. Any other contributor involved in data collection, transcription and translation, or who contributed in any other way, should be acknowledged by name. The relevant information is available in the metadata.

To refer to any data from the collection, please cite as follows:

Vita, Vasiliki, Pedro, Chelsea, Marino, Lincy Lee, Nestor, Daphne and Young Historians of Sonsorol. 2023. Collaborative corpus building for Sonsorolese. Endangered Languages Archive. Handle: Accessed on [insert date here].

© Copyright 2024