OpenConvert | | Enriching Data | corpus processing, format conversion, text conversion, tokenisation, part of speech tagging | local desktop | Linguistics, Religion Studies, Communication and Media Studies, Cultural Sciences, History, Literary Studies, Philosophy, Political Studies | | Language independent | text/plain, application/msword, text/html, text/xml, application/epub+zip, application/zip | released | CLARIN-NL, CLARIAH-CORE | Dutch Language Institute |
PICCL | v0.6.4 | Enriching Data | optical character recognition, orthographic normalisation, sentence splitting, tokenisation, dependency parsing, shallow parsing, lemmatisation, morphological analysis, named entity recognition, part of speech tagging | local desktop | Linguistics, Philosophy, Literary Studies, Religion Studies, History | general linguistics, Orthography, Morphology, Syntax | Dutch, Swedish, Russian, Spanish, Portuguese, English, German, French, Italian, Finnish, Modern Greek, Classical Greek, Icelandic, German (Fraktur), Latin, Romanian | application/pdf, image/tiff, text/plain, text/folia+xml, image/vnd.djvu | published | CLARIN-NL, CLARIAH-CORE | none yet |
Ucto Engine | v0.13 | Enriching Data | sentence splitting, tokenisation | local desktop | Linguistics | general linguistics, Syntax | Language independent | application/pdf, application/msword, text/folia+xml, text/plain | published | CLARIN-NL, CLARIAH-CORE | none yet |
Ucto | v0.13 | Enriching Data | sentence splitting, tokenisation | local desktop | Linguistics | general linguistics, Syntax | Dutch, Swedish, Russian, Spanish, Portuguese, English, German, French, Italian | application/pdf, application/msword, text/folia+xml, text/plain | published | CLARIN-NL, CLARIAH-CORE | none yet |