Using the EOPAS system

EOPAS is provided both as an online system for playing media together with interlinear glossed text (IGT), and as an open-source downloadable system that you can install to run your own EOPAS instance.

Overview

The current working instance is hosted by the School of Languages and Linguistics at the University of Melbourne and was built as part of a project run by Nick Thieberger (see the bottom of this page for the list of Chief Investigators in the project). The current working version is available on the EOPAS website. See also the details of the System Architecture.

Users of EOPAS fall into one of two categories. An ordinary user clicks an agreement and has access to view and play accessible content. A registered user can upload their own material to the collection.

To get your own material into EOPAS you will need a media file and a time-aligned transcript in one of the following formats: Toolbox, Transcriber, or Elan.

Specific requirements for each of these formats are outlined below.

You will need to register as an EOPAS user to be allowed to upload your own material.

We do not guarantee long-term hosting of this material and encourage you to lodge your primary data in a suitable language archive, preferably one that is Open Language Archives Community (OLAC) compliant.

Media formats

You can upload media files in various formats (including wav, mp3, mov, mp4), and they will be transcoded to the Ogg streaming format (future plans include transcoding to WebM and MP4). Please be patient, as this process can take a few minutes to complete.

Transcript formats

In order to be uploaded to EOPAS, transcripts must have timecodes. Each timecode consists of the name of the media file (which should not contain any spaces), followed by a space and no punctuation, then the start time (in seconds, followed by a dot, followed by decimal fractions of a second), followed by a space and no punctuation, then the end time (in seconds, followed by a dot, followed by decimal fractions of a second).
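The timecode layout described above can be checked before upload. The following is a minimal sketch, not part of EOPAS itself: the names Timecode and parse_timecode are hypothetical, and the check that the end time is later than the start time is an added assumption rather than a documented EOPAS rule.

```python
import re
from typing import NamedTuple


class Timecode(NamedTuple):
    media: str    # media file name, with no spaces
    start: float  # start time in seconds
    end: float    # end time in seconds


# File name without spaces, one space, start time, one space, end time.
# Each time is written as seconds, a dot, and decimal fractions of a second.
_TIMECODE = re.compile(r"(\S+) (\d+\.\d+) (\d+\.\d+)")


def parse_timecode(value: str) -> Timecode:
    """Parse 'media start end' and raise ValueError if the layout is wrong."""
    match = _TIMECODE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a valid timecode: {value!r}")
    start, end = float(match.group(2)), float(match.group(3))
    # Assumption: a segment must end after it starts.
    if end <= start:
        raise ValueError(f"end time is not after start time: {value!r}")
    return Timecode(match.group(1), start, end)
```

For example, parse_timecode("story01.wav 12.345 15.678") returns a Timecode whose media is "story01.wav", while a file name containing a space, or a comma between the two times, raises ValueError.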

Toolbox

Toolbox text files that include timecodes must be exported using the Toolbox XML export.

Toolbox XML files can be uploaded to EOPAS if they conform to the marker hierarchy indicated in this image.

EOPAS toolbox format

A template for a Toolbox project that includes the correct hierarchy is available: EOPAS Toolbox template (with thanks to Sebastian Drude for corrections). The specification for the Toolbox structure that allows a correct XML export is as follows:

The record marker is \itm.

Each sentence has an id (\id), followed by an \aud marker containing the name of the media file, then a space and no punctuation, then the start time, then a space and no punctuation, then the end time (in seconds and milliseconds).

The next lines are the text (\tx), morpheme (\mr), gloss (\mg) and free gloss (\fg).
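The marker order for a single sentence can be checked with a short script before exporting. This is only a sketch under stated assumptions: the function names are hypothetical, the sample values are placeholders rather than real language data, and the check covers only the sentence-level markers listed above (it does not look at the \itm record marker or any Toolbox header lines).

```python
from typing import List

# Sentence-level markers in the order given in the specification above.
EXPECTED_MARKERS = ["id", "aud", "tx", "mr", "mg", "fg"]


def markers_of(block: str) -> List[str]:
    """Return the field markers, in order, found in Toolbox-style lines."""
    found = []
    for line in block.splitlines():
        line = line.strip()
        if line.startswith("\\"):
            # The marker is the text between the backslash and the first space.
            parts = line[1:].split(None, 1)
            if parts:
                found.append(parts[0])
    return found


def follows_hierarchy(block: str) -> bool:
    """True if the block contains exactly the expected markers, in order."""
    return markers_of(block) == EXPECTED_MARKERS


# Placeholder sentence block; every value here is invented for illustration.
sample = r"""
\id 001
\aud story01.wav 12.345 15.678
\tx aaa bbb
\mr aaa bbb
\mg GLOSS1 GLOSS2
\fg placeholder free gloss
"""

print(follows_hierarchy(sample))
```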

The exported XML file can be validated against the schema.

Transcriber

A Transcriber file should have a single line of transcription only and no other tiers. This means that the transcript will be in the transcribed language only and will not have all the interlinear features of a Toolbox file. EOPAS supports output from Transcriber up to version 1.5.2 (version 1.6 appears to use different formats, which will be dealt with in future versions of EOPAS).

The Transcriber (.trs) file can be validated against the schema.

Elan

Elan files should have a single line of transcription only.

The Elan (.eaf) file can be validated against the schema.
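Before uploading, the single-transcription requirement for Elan files can be checked roughly by counting tiers. The sketch below uses Python's standard xml.etree.ElementTree module and assumes that "a single line of transcription" corresponds to exactly one TIER element in the .eaf file; that reading, the function names, and the heavily trimmed sample document are all assumptions for illustration, and the check is not a substitute for validating against the schema.

```python
import xml.etree.ElementTree as ET
from typing import List


def tier_ids(eaf_xml: str) -> List[str]:
    """Return the TIER_ID of every TIER element in an EAF document string."""
    root = ET.fromstring(eaf_xml)
    return [tier.get("TIER_ID", "") for tier in root.iter("TIER")]


def has_single_tier(eaf_xml: str) -> bool:
    """True if the document contains exactly one tier."""
    return len(tier_ids(eaf_xml)) == 1


# Trimmed placeholder document: not a complete, schema-valid EAF file.
sample_eaf = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="12345"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="15678"/>
  </TIME_ORDER>
  <TIER TIER_ID="transcription" LINGUISTIC_TYPE_REF="default-lt">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>placeholder text</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

print(tier_ids(sample_eaf))
```

To check a file on disk rather than a string, ET.parse(path).getroot() can be used in place of ET.fromstring.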

Funded by the former Institute for a Broadband-Enabled Society (IBES), now the Networked Society Institute.

Chief Investigators: Nick Thieberger, Linguistics, the University of Melbourne; Rachel Nordlinger, Linguistics, the University of Melbourne; Cathy Falk, Music, the University of Melbourne; Steven Bird, Computer Science and Software Engineering, the University of Melbourne; Linda Barwick, PARADISEC, University of Sydney.