Frog | v0.15 | Enriching Data | dependency parsing, shallow parsing, lemmatisation, morphological analysis, named entity recognition, part of speech tagging, sentence splitting, tokenisation | web application | Linguistics | general linguistics, Syntax | Dutch | application/pdf, application/msword, text/folia+xml, text/plain | published | CLARIN-NL, CLARIAH-CORE | none yet |
Alpino (CLST web service and application) | unknown | Enriching Data | parsing, dependency parsing, lemmatisation, morphological analysis, named entity recognition, part of speech tagging, sentence splitting, tokenisation | web application | Linguistics | general linguistics, Syntax | Dutch | text/plain | published | CLARIAH-CORE | none yet |
Alpino | unknown | Enriching Data | parsing, dependency parsing, lemmatisation, morphological analysis, named entity recognition, part of speech tagging, sentence splitting, tokenisation | web application | Linguistics | general linguistics, Syntax | Dutch | text/plain | published | CLARIAH-CORE | none yet |
PICCL | v0.6.4 | Enriching Data | optical character recognition, orthographic normalisation, sentence splitting, tokenisation, dependency parsing, shallow parsing, lemmatisation, morphological analysis, named entity recognition, part of speech tagging | local desktop | Linguistics, Philosophy, Literary Studies, Religion Studies, History | general linguistics, Orthography, Morphology, Syntax | Dutch, Swedish, Russian, Spanish, Portuguese, English, German, French, Italian, Finnish, Modern Greek, Classical Greek, Icelandic, German (Fraktur), Latin, Romanian | application/pdf, image/tiff, text/plain, text/folia+xml, image/vnd.djvu | published | CLARIN-NL, CLARIAH-CORE | none yet |
Ucto Engine | v0.13 | Enriching Data | sentence splitting, tokenisation | local desktop | Linguistics | general linguistics, Syntax | Language independent | application/pdf, application/msword, text/folia+xml, text/plain | published | CLARIN-NL, CLARIAH-CORE | none yet |
Ucto | v0.13 | Enriching Data | sentence splitting, tokenisation | local desktop | Linguistics | general linguistics, Syntax | Dutch, Swedish, Russian, Spanish, Portuguese, English, German, French, Italian | application/pdf, application/msword, text/folia+xml, text/plain | published | CLARIN-NL, CLARIAH-CORE | none yet |
Tokenizer | | Enriching Data | tokenisation, sentence splitting | web service | Linguistics | Morpho-syntax, Syntax | Czech, Slovenian, Hungarian, Italian, German, English | | production | not specified | IMS: University of Stuttgart |
Tokenizer/Sentences - OpenNLP Project | | Enriching Data | tokenisation, sentence splitting | web service | Linguistics | Morpho-syntax, Syntax | unclear | | production | not specified | SfS: Uni-Tuebingen |
Tokenizer/Sentences - Alpino | | Enriching Data | tokenisation, sentence splitting | web service | Linguistics | Morpho-syntax, Syntax | Dutch | | development | not specified | SfS: Uni-Tuebingen |
Tokenizer and Sentence Splitter | | Enriching Data | tokenisation, sentence splitting | web service | Linguistics | Morpho-syntax, Syntax | unclear | | production | not specified | BBAW: Berlin-Brandenburg Academy of Sciences and Humanities |