CA-bank
In cooperation with TALKBANK, the MOVIN group makes high-quality CA transcriptions and data available for CA research. Most of the data has been edited in CLAN, a transcription editor that lets users link transcriptions to audio and/or video files, as well as to pictures and text files. Apart from its transcription functions, CLAN has extensive database facilities and can export data to other major transcription editors (ELAN, EXMARaLDA, PRAAT, ...).
An earlier MOVIN corpus exhibited small data samples from several languages. This corpus is currently being updated and will be available later in 2008.
For the time being, CA-bank features a large number of Gail Jefferson’s transcripts, which have been collected in GailBank, a subset of CA-bank.
A large 20-hour corpus of Danish will be made available over the next three years. The corpus is funded by the Danish DK-CLARIN network grant and consists of audio and video data from institutional and mundane environments. The first parts of the corpus will be available in August 2008.
To allow advanced searching procedures, the transcriptions in the Danish corpus will follow a strict data model so that the data can be transferred to XML. See also the revised transcription symbols; they are fully implemented on the keyboard when using CLAN.
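As a rough illustration of why a strict data model makes the transfer to XML straightforward, here is a minimal sketch in Python. It assumes a heavily simplified CHAT-style layout in which every line starts with `@` (header), `*` (speaker turn) or `%` (dependent tier). The function name `chat_to_xml`, the element names, the speaker codes and the sample lines are all hypothetical; this is not the TalkBank XML schema, the Danish corpus data model, or CLAN's own export.

```python
import xml.etree.ElementTree as ET


def chat_to_xml(lines):
    """Convert simplified CHAT-style lines into an XML element tree.

    Assumed (hypothetical) model: '@' lines are headers, '*' lines are
    speaker turns, '%' lines are tiers attached to the preceding turn.
    Anything else is rejected, which is what "strict" means here.
    """
    root = ET.Element("transcript")
    current = None  # the most recent speaker turn, if any
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        if line.startswith("@"):
            name, _, value = line[1:].partition(":")
            header = ET.SubElement(root, "header", name=name.strip())
            header.text = value.strip()
        elif line.startswith("*"):
            speaker, _, text = line[1:].partition(":")
            current = ET.SubElement(root, "utterance", speaker=speaker.strip())
            current.text = text.strip()
        elif line.startswith("%") and current is not None:
            tier, _, text = line[1:].partition(":")
            dependent = ET.SubElement(current, "tier", type=tier.strip())
            dependent.text = text.strip()
        else:
            # Continuation lines and free text are not handled in this sketch.
            raise ValueError(f"line {number} does not fit the data model: {line!r}")
    return root


# Hypothetical sample, not taken from the corpus.
sample = [
    "@Begin",
    "@Languages:\tdan",
    "*ANN:\thej .",
    "%com:\tlaughs",
    "*BO:\tdav .",
    "@End",
]

if __name__ == "__main__":
    print(ET.tostring(chat_to_xml(sample), encoding="unicode"))
```

Because every line must match one of the three patterns, a malformed transcript is caught at conversion time rather than surfacing later as a failed search.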