With this web application, an end user can have historical Dutch texts tokenized, lemmatized and part-of-speech tagged, using the resources (such as lexica) most appropriate for the text in question. For each text, the user can select the best resources from those available in CLARIN, wherever they reside, supplemented where necessary by their own lexica.
Gabmap is a free web-based application for dialectometry. It measures the differences between sets of phonetic (or phonemic) transcriptions via edit distance. Its graphical user interface makes this string comparison facility available as a web application, which enables wider experimentation with the techniques.
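To illustrate the kind of comparison involved, here is a minimal Python sketch of plain (unweighted) Levenshtein edit distance between two transcriptions. It is not Gabmap's own implementation, which may use different segment weights or normalization. The function names `edit_distance` and `site_distance`, and the example transcriptions, are hypothetical and chosen only for illustration.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two sequences of segments.

    Each insertion, deletion or substitution costs 1 (an assumption;
    real dialectometric tools may weight segment pairs differently).
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def site_distance(site_a, site_b):
    """Mean edit distance over the items transcribed at both sites."""
    shared = sorted(set(site_a) & set(site_b))
    if not shared:
        raise ValueError("the two sites share no items")
    return sum(edit_distance(site_a[k], site_b[k]) for k in shared) / len(shared)


# Hypothetical transcriptions for two sites, keyed by item.
site_a = {"milk": "mɛlk", "house": "hœys"}
site_b = {"milk": "mɛlək", "house": "hus"}
print(site_distance(site_a, site_b))
```

Treating each character as one segment is a simplification: transcriptions with diacritics or multi-character segments would first need to be split into a list of segments, which `edit_distance` accepts as well as strings.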
The DUELME search interface provides access to the DUELME electronic lexicon, which contains more than 5,000 Dutch multiword expressions (MWEs). The search interface enables users to search for MWEs on the basis of a range of syntactic and semantic criteria, including expression, pattern id, written form, type, conjugation, polarity, parameters and form. Extensive documentation on the structure of the database is available.
TICCL (Text Induced Corpus Clean-up) is a system designed to detect and correct typographical errors (misprints) and optical character recognition (OCR) errors in texts. It searches a corpus for all existing variants of (potentially) all words occurring in that corpus. The corpus can be one text or several, in one or more directories, located on one or more machines. TICCL creates word frequency lists, listing for each word type how often the word occurs in the corpus. The frequency of a normalized word form is the sum of the frequencies of the actual word forms found in the corpus that map to it.
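The frequency bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of how variant frequencies add up under a normalized form, not TICCL's own code, and it does not attempt TICCL's variant detection. The variant-to-normalized mapping is supplied by hand here, and the names `word_frequencies` and `normalized_frequencies` are hypothetical.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word type occurs (case-folded, simple tokenizer)."""
    return Counter(re.findall(r"\w+", text.lower()))


def normalized_frequencies(freqs, variant_to_norm):
    """Sum the frequencies of all variants under their normalized form.

    Words without an entry in variant_to_norm count as their own
    normalized form.
    """
    totals = Counter()
    for word, count in freqs.items():
        totals[variant_to_norm.get(word, word)] += count
    return totals


# Hypothetical corpus with two OCR-style misreadings of "quick".
corpus = "The qnick brown fox and the quick dog and the quiek cat"
# Hand-made mapping; in a real system this would come from variant detection.
variants = {"qnick": "quick", "quiek": "quick"}

freqs = word_frequencies(corpus)
norm = normalized_frequencies(freqs, variants)
print(norm.most_common(3))
```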
The Transcription Quality Evaluation tool can be used to check the quality of phonetic transcriptions. The researcher only needs to upload pairs of files, each consisting of an audio file and a transcription file. After uploading, the researcher receives an e-mail with the matching output.
The MIMORE tool enables researchers to investigate morphosyntactic variation in the Dutch dialects by searching three related databases with a common on-line search engine. The search results can be visualized on geographic maps and exported for statistical analysis. The three databases involved are DynaSAND (the dynamic syntactic atlas of the Dutch dialects), DiDDD (Diversity in Dutch DP Design) and GTRP (Goeman, Taeldeman, van Reenen Project).
This research tool provides information on medieval Arthurian narratives and the manuscripts in which they are transmitted throughout Europe. The tool discloses a database consisting of linked records on over two hundred texts, more than a thousand manuscripts and two hundred persons. The database is a work in progress: a considerable number of records have yet to be completed, while fresh discoveries of narratives and manuscripts invite new entries. The compilers of the database hope that this tool will contribute to further research into Arthurian fiction as a pan-European phenomenon.
The AAM-LR web service helps researchers annotate audio and video recordings. At the top level, the service marks the time intervals at which specific persons in the recording are speaking. In addition, the service provides a global phonetic annotation, using language-independent phone models and phonetic features. Speech is separated from speaker noises such as laughing. The output of the web service is fed into the ELAN/ANNEX editor to facilitate further manual annotation. The annotations conform to ISOCat, and potential new categories were added to ISOCat.