The collection consists of queries and documents provided by the Qwant search engine (https://www.qwant.com). The queries, which were issued by Qwant users, are based on selected trending topics. The documents in the collection are the webpages that were selected with respect to these queries using the Qwant click model. Apart from the documents selected using this model, the collection also contains randomly selected documents from the Qwant index.
The collection serves as the official test collection for the 2023 LongEval Information Retrieval Lab (https://clef-longeval.github.io/) organised at CLEF. The collection contains test datasets for two organized sub-tasks: short-term persistence (sub-task A) and long-term persistence (sub-task B). The data for the short-term persistence sub-task was collected over July 2022 and this dataset contains 1,593,376 documents and 882 queries. The data for the long-term persistence sub-task was collected over September 2022 and this dataset consists of 1,081,334 documents and 923 queries. Apart from the original French versions of the webpages and queries, the collection also contains their translations into English.
The collection consists of queries and documents provided by the Qwant search engine (https://www.qwant.com). The queries, which were issued by Qwant users, are based on selected trending topics. The documents in the collection were selected with respect to these queries using the Qwant click model. Apart from the documents selected using this model, the collection also contains randomly selected documents from the Qwant index. All the data were collected over June 2022. In total, the collection contains 672 train queries, with 9,656 corresponding assessments coming from the Qwant click model, and 98 heldout queries. The set of documents consists of 1,570,734 downloaded, cleaned and filtered webpages. Apart from their original French versions, the collection also contains translations of the webpages and queries into English. The collection serves as the official training collection for the 2023 LongEval Information Retrieval Lab (https://clef-longeval.github.io/) organised at CLEF.
Netgraph is a graphically oriented client-server application for searching in linguistically annotated treebanks. The query language of Netgraph is simple and intuitive, yet powerful enough for treebanks with complex annotation schemes. The primary purpose of Netgraph is searching in the Prague Dependency Treebank 2.0; nevertheless, it can be used for other treebanks as well.
A system for querying annotated treebanks in PML format. Querying uses its own query language with a graphical representation. It has two different implementations (SQL and Perl) and several clients (TrEd, browser-based, and a command-line interface).
This component integrates the other VIADAT modules; together with VIADAT-REPO, it forms the Virtual Assistant for accessing historical audiovisual data.
The zip archive contains sources for the following modules: VIADAT, VIADAT-DEPOSIT, VIADAT-TEXT, VIADAT-ANNOTATE, VIADAT-ANALYZE, VIADAT-STAT, VIADAT-GIS and VIADAT-SEARCH.
Developed in cooperation with ÚSD AV ČR and NFA.
A VIADAT module; VIADAT-ANALYZE is an interactive environment that enables the end user to work with stored, annotated and indexed audio recordings, allowing visualization and extraction of results.
Developed in cooperation with ÚSD AV ČR and NFA.