Texcavator

Description

Texcavator enables researchers to run full-text searches on the newspaper archive of the Dutch Royal Library. It also offers visualizations such as word clouds, timelines and heat maps, and services that refine a search, such as filtering, stopword removal, normalization and stemming.

Texcavator also gives access to ShiCo (Shifting Concepts), developed by Carlos Martinez Ortiz (NL eScience Center). ShiCo is a tool for visualizing time-shifting concepts. We refer to a concept as the set of words which are related to a given seed word. ShiCo uses a set of semantic models (word2vec) spanning a number of years to explore how concepts change over time: words related to a given concept at time t=0 may differ from the words related to the same concept at time t=n.

Texcavator originated from the earlier text mining applications WAHSP and BiLand. During the Translantis project, the application was renamed to Texcavator and further developed by the UvA (Fons Laan). In May 2014, development was taken over by the Netherlands eScience Center (Janneke van der Zwaan). From April 2015 onwards, Texcavator was developed at the Digital Humanities lab of Utrecht University (Julian Gonggrijp and Martijn van der Klis). ShiCo was created in cooperation with the NL eScience Center (Carlos Martinez Ortiz).

Project

Translantis

CLARIN National Project

CLARIN centre

Huygens ING

Research domain

Tool task

Country

Netherlands

Tool Type

Research Phase

Tool status

Output format

Input Language

Version

1.2.2

Access Contact

Project Contact

Creator Contact

Documentation

Source code

Publications

Snelders, S, Huijnen, P, Verheul, J, de Rijke, M and Pieters, T. 2017. A Digital Humanities Approach to the History of Culture and Science: Drugs and Eugenics Revisited in Early 20th-Century Dutch Newspapers, Using Semantic Text Mining. In: Odijk, J and van Hessen, A. (eds.) CLARIN in the Low Countries, Pp. 325–336. London: Ubiquity Press. DOI: https://doi.org/10.5334/bbi.27. License: CC-BY 4.0

Resource

CMDI File Link

License

Apache License

Inventory Scope