Adelheid Tagger-Lemmatizer

Enriching Data
part of speech tagging, lemmatisation, tokenisation
web application
Linguistics |
general linguistics, Syntax, historical linguistics |
Language independent |
text/plain, text/xml |
released |
CLARIN-NL |
MPI for Psycholinguistics |
Frog |
v0.15 |
Enriching Data |
dependency parsing, shallow parsing, lemmatisation, morphological analysis, named entity recognition, part of speech tagging, sentence splitting, tokenisation |
web application |
Linguistics |
general linguistics, Syntax |
Dutch |
application/pdf, application/msword, text/folia+xml, text/plain |
published |
CLARIN-NL, CLARIAH-CORE |
none yet |
OpenConvert |
|
Enriching Data |
corpus processing, format conversion, text conversion, tokenisation, part of speech tagging |
local desktop |
Linguistics, Religion Studies, Communication and Media Studies, Cultural Sciences, History, Literary Studies, Philosophy, Political Studies |
|
Language independent |
text/plain, application/msword, text/html, text/xml, application/epub+zip, application/zip |
released |
CLARIN-NL, CLARIAH-CORE |
Dutch Language Institute |
TTNWW |
|
Enriching Data |
grammatical relation assignment, coreference resolution, corpus processing, dependency parsing, lemmatisation, multiword unit identification, named entity recognition, orthographic normalisation, part of speech tagging, semantic role labeling, chunking, parsing, speech recognition, speech transcription, tokenisation, up/down sampling |
web application |
Linguistics, Communication and Media Studies, History, Oral History |
discourse analysis, Orthography, Semantics, Syntax |
Dutch |
text/plain, audio/wav |
withdrawn |
CLARIN-NL |
Meertens/HuC |
Alpino (CLST web service and application) |
unknown |
Enriching Data |
parsing, dependency parsing, lemmatisation, morphological analysis, named entity recognition, part of speech tagging, sentence splitting, tokenisation |
web application |
Linguistics |
general linguistics, Syntax |
Dutch |
text/plain |
published |
CLARIAH-CORE |
none yet |
Alpino |
unknown |
Enriching Data |
parsing, dependency parsing, lemmatisation, morphological analysis, named entity recognition, part of speech tagging, sentence splitting, tokenisation |
web application |
Linguistics |
general linguistics, Syntax |
Dutch |
text/plain |
published |
CLARIAH-CORE |
none yet |
PICCL |
v0.6.4 |
Enriching Data |
optical character recognition, orthographic normalisation, sentence splitting, tokenisation, dependency parsing, shallow parsing, lemmatisation, morphological analysis, named entity recognition, part of speech tagging |
local desktop |
Linguistics, Philosophy, Literary Studies, Religion Studies, History |
general linguistics, Orthography, Morphology, Syntax |
Dutch, Swedish, Russian, Spanish, Portuguese, English, German, French, Italian, Finnish, Modern Greek, Classical Greek, Icelandic, German (Fraktur), Latin, Romanian |
application/pdf, image/tiff, text/plain, text/folia+xml, image/vnd.djvu |
published |
CLARIN-NL, CLARIAH-CORE |
none yet |
Ucto Engine |
v0.13 |
Enriching Data |
sentence splitting, tokenisation |
local desktop |
Linguistics |
general linguistics, Syntax |
Language independent |
application/pdf, application/msword, text/folia+xml, text/plain |
published |
CLARIN-NL, CLARIAH-CORE |
none yet |
Ucto |
v0.13 |
Enriching Data |
sentence splitting, tokenisation |
local desktop |
Linguistics |
general linguistics, Syntax |
Dutch, Swedish, Russian, Spanish, Portuguese, English, German, French, Italian |
application/pdf, application/msword, text/folia+xml, text/plain |
published |
CLARIN-NL, CLARIAH-CORE |
none yet |
ReLDI tag+lemma+parse |
|
Enriching Data |
tokenisation, part of speech tagging, lemmatisation, parsing |
web service |
Linguistics |
Morpho-syntax, Syntax |
Slovenian, Serbian, Croatian |
|
development |
not specified |
JSI |
ReLDI tag+lemma+NER |
|
Enriching Data |
tokenisation, part of speech tagging, lemmatisation, named entity recognition |
web service |
Linguistics |
Morpho-syntax, Syntax |
Slovenian, Serbian, Croatian |
|
development |
not specified |
JSI |
OpeNER tokenizer |
|
Enriching Data |
tokenisation |
web service |
Linguistics |
Morpho-syntax, Syntax |
Italian |
|
development |
not specified |
ILC-CNR |
Tokenizer |
|
Enriching Data |
tokenisation, sentence splitting |
web service |
Linguistics |
Morpho-syntax, Syntax |
Czech, Slovenian, Hungarian, Italian, German, English |
|
production |
not specified |
IMS: University of Stuttgart |
Stanford Tokenizer |
|
Enriching Data |
tokenisation |
web service |
Linguistics |
Orthography |
unclear |
|
production |
not specified |
SfS: Uni-Tuebingen |
Tokenizer/Sentences - OpenNLP Project |
|
Enriching Data |
tokenisation, sentence splitting |
web service |
Linguistics |
Morpho-syntax, Syntax |
unclear |
|
production |
not specified |
SfS: Uni-Tuebingen |
Tokenizer/Sentences - Alpino |
|
Enriching Data |
tokenisation, sentence splitting |
web service |
Linguistics |
Morpho-syntax, Syntax |
Dutch |
|
development |
not specified |
SfS: Uni-Tuebingen |
Tokenizer - OpenNLP Project |
|
Enriching Data |
tokenisation |
web service |
Linguistics |
Orthography |
unclear |
|
production |
not specified |
SfS: Uni-Tuebingen |
Tokenizer and Sentence Splitter |
|
Enriching Data |
tokenisation, sentence splitting |
web service |
Linguistics |
Morpho-syntax, Syntax |
unclear |
|
production |
not specified |
BBAW: Berlin-Brandenburg Academy of Sciences and Humanities |