Title
    GrETEL Search Engine for Querying Syntactic Constructions in Treebanks  
  Description
     
				GrETEL Version 4 extends the functionality of GrETEL in a number of ways:
				
				(1) It allows users to upload their own corpus, either a parsed corpus in LASSY DTD format or a text corpus in one of several formats (currently supported: plain text and CHAT; FoLiA and TEI to follow). An uploaded unparsed corpus is automatically parsed by Alpino and made available in the search interface.
				(2) It enables the upload of metadata contained in the data, e.g. in accordance with the PaQu metadata format.
				(3) XPath expressions are checked, valid attributes and values are suggested, and macros can be used (à la PaQu).
				(4) Results can be filtered on the basis of metadata.
				(5) The metadata can be analysed in combination with nodes and their properties from the syntactic structures: a pivot table with draggable attributes and node properties lets a user select, group, sort, and filter data and metadata together. Sums, counts, heatmaps and other graphical visualisations enable rapid analysis of the data.
				
				GrETEL 4 is being developed by Utrecht University (UiL-OTS and Digital Humanities Lab) in the context of the AnnCor and CLARIAH-CORE projects. It is based on GrETEL 3, developed by KU Leuven.
				
				     GrETEL is a query engine in which linguists can search a treebank by starting from a natural language example rather than a formal search instruction. This gives novice and non-technical users a convenient way to use treebanks with only limited knowledge of the underlying tree representations and formal query languages. By allowing linguists to search for constructions similar to the example they provide, it aims to bridge the gap between descriptive-theoretical and computational linguistics.
     The example-based query procedure consists of six steps. In the first step the user enters an example of the construction they are interested in. In the second step the example sentence is automatically parsed with the Alpino parser, resulting in a parse tree. In the third step the example is returned in the form of a matrix, in which the user specifies which aspects of the example are essential for the construction under investigation. Based on the information indicated in the matrix, a query tree is cut out of the initial parse tree and automatically converted to an XPath query. Clicking the advanced options button makes the XPath query visible, so that it can be edited if desired. This query is used for the actual treebank search.
     In the fourth step the user selects the treebank(s) to be queried. The fifth step provides an overview of the search instruction, i.e. the input sentence, the selected treebank(s), the query tree and the (adapted) XPath query. In the sixth step the query is executed on the selected corpus. The matching constructions are presented to the user as a list of sentences, which can be downloaded. The user can also click on a sentence to visualize the result as a syntax tree.
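     As an illustration of the conversion in the third step, the sketch below serialises a small query tree as an XPath expression. It is not GrETEL's actual implementation: the `QueryNode` class and the `to_xpath` function are hypothetical names introduced for this example, and the query tree (a noun phrase with a determiner and a nominal head) is hand-written. It assumes only that each tree node is a `node` element carrying attributes such as `cat`, `rel` and `pt`.

```python
from dataclasses import dataclass, field


@dataclass
class QueryNode:
    """Hypothetical query-tree node: the attributes the user kept, plus children."""
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def to_xpath(node: QueryNode, top: bool = True) -> str:
    """Serialise a query tree as an XPath expression over <node> elements."""
    # One condition per retained attribute, e.g. @cat="np".
    conditions = [f'@{name}="{value}"' for name, value in node.attrs.items()]
    # Each child becomes a nested condition on a direct child <node>.
    conditions += [to_xpath(child, top=False) for child in node.children]
    prefix = "//node" if top else "node"
    if not conditions:
        return prefix
    return prefix + "[" + " and ".join(conditions) + "]"


# Assumed example: the user marked the category of the phrase and the
# relation and part of speech of its two daughters as essential.
query = QueryNode(
    {"cat": "np"},
    [
        QueryNode({"rel": "det", "pt": "lid"}),
        QueryNode({"rel": "hd", "pt": "n"}),
    ],
)

xpath = to_xpath(query)
print(xpath)
```

     Attributes left out of the query tree impose no constraint, which is how deselecting a property in the matrix broadens the search.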
     In addition to the example-based search functionality, users can also query the treebanks using an XPath expression. This query is then processed in the same way as the automatically generated query in the example-based approach.
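     To give an impression of what a hand-written XPath query matches, the sketch below runs a simple expression over a single parse. The XML is a simplified, hand-written fragment in the style of Alpino output, not a real treebank file, and Python's standard `xml.etree.ElementTree` (which supports only a subset of XPath) stands in for the query back end purely for illustration.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written parse of "de hond blaft" (assumed structure).
SAMPLE = """
<alpino_ds>
  <node cat="top" rel="top">
    <node cat="smain" rel="--">
      <node cat="np" rel="su">
        <node rel="det" pt="lid" word="de"/>
        <node rel="hd" pt="n" word="hond"/>
      </node>
      <node rel="hd" pt="ww" word="blaft"/>
    </node>
  </node>
  <sentence>de hond blaft</sentence>
</alpino_ds>
""".strip()

root = ET.fromstring(SAMPLE)

# Find every noun phrase anywhere in the tree.
matches = root.findall(".//node[@cat='np']")

# Collect the words under each match, in document order.
phrases = [
    " ".join(leaf.get("word") for leaf in match.iter("node") if leaf.get("word"))
    for match in matches
]
print(phrases)
```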
				    
  Project
    AnnCor  
  
    CLARIAH-CORE  
  CLARIN National Project
CLARIN centre
    none yet  
  Research domain
Linguistic Subject
Tool task
Country
    Netherlands  
  Tool Type
Research Phase
Tool status
Input format
Output format
Input Language
Version
    4.0  
  Access Contact
Project Contact
Creator Contact
    Martijn van der Klis  
  
    Sheean Spoel  
  
    Gerson Foks  
  Documentation
Source code
Original source
Publications
    Augustinus, L, Vandeghinste, V, Schuurman, I and Van Eynde, F. 2017. GrETEL: A Tool for Example-Based Treebank Mining. In: Odijk, J and van Hessen, A. (eds.) CLARIN in the Low Countries, Pp. 269–280. London: Ubiquity Press. DOI: https://doi.org/10.5334/bbi.22. License: CC-BY 4.0  
  
    Odijk, J, van der Klis, M and Spoel, S. 2018. Extensions to the GrETEL Treebank Query Application. In: Proceedings of TLT16, 23-24 Jan 2018, Prague, Czech Republic.
    	     
  Resource
CMDI File Link
License
    CC-BY-NC-SA  
  
        