WIP: War in Parliament
Summary
An advanced search engine for the OCR-ed scanned image collection of proceedings of the Dutch Hansard (Handelingen der Staten-Generaal 1930-1995). These proceedings are available as a fully annotated, semi-structured dataset for historical and social science research. The output of the search engine can be restricted by speaker name, party, date range, and other criteria.
Background
References to the Second World War (WW II) have shaped political debate in the Netherlands for many decades. However, we have no systematic knowledge of why, how often, when, by whom or from which political party, and in which context these references were made. Nor do we know the meanings politicians ascribed to the war years, the lessons the war was supposed to teach, and how all of this influenced political decision-making. WIP helps answer these questions and will help us better understand the complex legacies of WW II.
The WIP project bridges the gap between historical and social science practices and the possibilities offered by large corpora and language resources, in particular CLARIN tools for Dutch. The dataset, the Handelingen der Staten-Generaal (Dutch Hansard), is made compliant with CLARIN, ISOCAT and ISO/TC 37/SC 4 standards. The search engine for this dataset uses an intuitive and powerful query language based on XPath, and its output can be fed directly into analysis programs such as SPSS. Integrating this technology with important historical research questions will directly contribute to new and innovative ways of writing about history.
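To give a rough idea of what an XPath-style restriction by party and date range can look like, here is a minimal sketch using Python's standard library. The actual WIP schema and query syntax are documented in the manual; the element and attribute names used here (`speech`, `speaker`, `party`, `date`) and the sample records are invented for illustration only.

```python
import xml.etree.ElementTree as ET

# Invented sample data; the real WIP dataset uses its own schema.
SAMPLE_XML = """
<proceedings>
  <speech speaker="Speaker A" party="Party X" date="1950-03-01">text one</speech>
  <speech speaker="Speaker B" party="Party Y" date="1960-06-15">text two</speech>
  <speech speaker="Speaker C" party="Party X" date="1975-11-20">text three</speech>
</proceedings>
"""


def find_speeches(root, party, start, end):
    """Return speeches by `party` whose ISO date lies in [start, end]."""
    # ElementTree supports a limited XPath subset, including attribute predicates.
    matches = root.findall(f".//speech[@party='{party}']")
    # ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings.
    return [s for s in matches if start <= s.get("date", "") <= end]


root = ET.fromstring(SAMPLE_XML)
hits = find_speeches(root, "Party X", "1945-01-01", "1959-12-31")
print([s.get("speaker") for s in hits])
```

The party restriction is expressed in the path expression itself, while the date range is applied in Python because ElementTree's XPath subset has no string comparison operators.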
The search engine results can be exported in CSV (comma-separated values) format. This makes it easy to calculate statistics offline from a result set and to apply further filters.
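As an illustration of such offline statistics, the following sketch counts the rows of an exported result set per party. The column names (`speaker`, `party`, `date`) and the sample rows are assumptions made for this example; the real export may use different headers.

```python
import csv
import io
from collections import Counter

# Invented sample export; column names are hypothetical.
SAMPLE_CSV = """speaker,party,date
Speaker A,Party X,1950-03-01
Speaker B,Party Y,1960-06-15
Speaker C,Party X,1975-11-20
"""


def count_by_column(csv_file, column):
    """Count how often each value of `column` occurs in a CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


counts = count_by_column(io.StringIO(SAMPLE_CSV), "party")
print(counts.most_common())
```

For a real export, pass an open file instead of the in-memory sample, for example `open("results.csv", newline="", encoding="utf-8")`, where `results.csv` is a placeholder file name.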
- Project leader: Dr. Hinke Piersma (NIOD)
- CLARIN center: DANS
- Help contact: maartenmarx@uva.nl
- Website: http://wip.politicalmashup.nl/
- User scenarios (screencast): http://youtu.be/tEqAIH2o2rc
- Demo scenario: http://dev.clarin.nl/sites/default/files/WIP%20demoscenario.pdf
- Manual: http://dev.clarin.nl/sites/default/files/WIP%20manual.pdf
- Tool/Service link: http://wip.politicalmashup.nl/
- Publications:
  - Marx, M. (2011), Oorlog in de Kamer, NRC, March 3, 2011
  - ‘Waarom politici graag over de oorlog praten’, NRC-Handelsblad, February 25, 2011
  - ‘Zoekmachine vindt relevante WO2-verwijzingen in Handelingen der Staten Generaal. Dat doet denken aan de oorlog’, in: E-data & research, volume 6, no. 2, October 2011 http://www.edata.nl/0602_011011/pdf/0602_011011_1.pdf
  - ‘NIOD ontwikkelt zoekmachine die verwijzingen naar de oorlog opspoort’, in: Informatie Professional. Vakblad voor informatiewerkers, no. 11 (2011)
  - L. Buitinck and M. Marx (2012), ‘Two-stage named entity recognition using averaged perceptrons’, in: Proc. NLDB 2012, pp. 171-176