
Auslan Corpus



Language Auslan (ISO639-3:asf)
Depositor Trevor Johnston
Affiliation Macquarie University
Location Australia
Collection ID 0001
Grant ID MDP0088
Funding Body ELDP
Collection Status Collection online
Landing Page Handle




Summary of the collection

The Auslan Corpus Annotation Guidelines describe the annotation protocols used in the collection. They have been updated almost annually since they were first written up in 2005. The last update was in November 2016. The latest version of the Annotation Guidelines can be found at [URL]. Researchers requesting access to the annotation files available in this collection, which date from 2008 or 2012, should be aware that the later versions of the corpus are often quite different from the earlier archived versions.

It is important to note that this ELAR collection titled Auslan Corpus is a historical document. The corpus itself is changing and growing because further detailed annotation work by the depositor and colleagues is continuing, and much of it has yet to be uploaded to the collection as an update. (Strictly speaking, these annotations are additional to and independent of the original documentation project.) This annotation work will take many years to complete because the task is expensive and time-consuming. Only after the depositor and colleagues have completed their program of research and publication will the additional annotations be uploaded to the ELAR site. Thus, what is referred to as the Auslan Corpus in some publications is not the same thing as the resources in this collection. Researchers who wish to access detailed annotation sets for research, or for bona fide peer review of research publications that are based on them, should therefore contact the authors of the relevant publication directly. Preference will be given to research that involves collaboration, i.e., research that involves the depositor and/or colleagues in the proposed research and publication, that contributes to the Auslan Archive and Corpus by incorporating new annotations into the corpus for eventual inclusion in the ELAR collection, or both.


Group represented

Auslan (Australian Sign Language) is the signed language of the deaf community in Australia. It has evolved from forms of British Sign Language (BSL) which were brought to Australia in the 19th century. It is used by an estimated 6,500 deaf people as their first or preferred language (Johnston, T. 2006. W(h)ither the deaf community? Population, genetics, and the future of Australian Sign Language. Sign Language Studies, 6(2), 137–173). The number of deaf users of Auslan appears to have peaked in the 1980s and now seems to be declining due to a variety of factors, such as aging, decreasing incidence rates of permanent early childhood severe and profound deafness, and high rates of cochlear implantation. Consequently, the number of new deaf signers joining the community each year is modest, and the language is likely to become endangered within a generation or two.


Language information

Auslan, Australian Sign Language


Special characteristics

There are no audio files in this collection because the language being used here (Auslan) is a signed language. The movie files do have an audio track but the sound is not necessary. The people present at each recording session were all deaf and were unaware of background and environmental noise. If you download any movies from this collection, we suggest you play the movie with the sound switched off because you may find the background noise an irritating distraction.


Collection contents

The collection consists primarily of video recordings of 100 native or near-native deaf signers and, where they have been created, linked ELAN annotation files. Each participant took part in three hours of language-based activity that involved an interview, retelling stories, recalling personal events, responding to a questionnaire, engaging in spontaneous conversation, and responding in Auslan to various stimuli such as a picture-book story, a filmed cartoon, and a filmed story told in Auslan. The recordings have been edited into clips according to the activity and catalogued in the collection according to age, gender, region (by dialect and by region), and text type. The annotation files available in the collection are those for the Aesop’s fables (topics: “The boy who cried wolf” and “The hare and the tortoise”). No annotations exist for the other movie files in the collection.

The collection is intended to support initial and future corpus-based grammatical description of the language and serves as a basis for comparison of this relatively old and established signed language (due to its BSL lineage) with the emerging signed languages of newly created deaf communities that can be found in the developing world.

Sections of the collection include identifying personal information, involve participants revealing personal opinions about sensitive subjects (such as abortion, genetic screening, and disability), or discuss other people in their local deaf community. Even though annotations do not exist for these files, access to such movies is conditional on applicants (i) establishing that they can meaningfully use the data (that is, that they have competence in Auslan or a related sign language like BSL or NZSL) and (ii) undertaking that they will never show or make these movies available to third parties (including using such videos for teaching purposes) nor identify the individuals or their individual opinions. Unless researchers requesting access can show that they satisfy these criteria, they are unlikely to be given access to these movies.


Collection history

The recordings were made between 2004 and 2007 and deposited at the Endangered Languages Archive in late 2008. Annotation of the data during the timeframe of the ELDP-funded project was extremely limited and provisional due to time and personnel constraints (annotation was done by the equivalent of only one and a half full-time researchers for less than 18 months until collection in mid-2008). The annotations are basic only (sign segmentation, glossing, translation). Limited updates to some annotation files were made in 2012 and 2018.


Acknowledgement and citation

Users of any part of the collection should acknowledge Adam Schembri as a co-researcher on the project; Julia Allen (Sydney), Kevin Cresdee (Adelaide), Stephanie Linder (Melbourne), Patti Levitzke-Gray (Perth), and Kim Pickering (Brisbane) as data collectors; Trevor Johnston as the primary and ongoing annotator; and, in chronological order of initial involvement from first to most recent, Adam Schembri, Della Goswell, Dani Fried, Louise de Beuzeville, Karin Banna, Gerry Shearim, Julia Allen, Lindsay Ferrara, Gabrielle Hodge, Michael Gray, Ben Hatchard, Christopher Hansford, and Jane van Roekel as annotators. Users should also acknowledge the Endangered Languages Documentation Programme as the funders. We remind users that videos cannot be publicly distributed or broadcast and that participants were assured they would never be identified by name. Codes are used to refer to video clips. If a user happens to be able to identify a participant in a video, they should never link any particular still, clip, or written annotated utterance, in a presentation or publication, with the name of the person or any other identifying information.

To refer to any data from the collection, please cite as follows:

Johnston, Trevor. 2008. Auslan Corpus. Endangered Languages Archive. Handle: Accessed on [insert date here].

© Copyright 2023