The corpus contains video files of Czech Television News Broadcasts and JSON files with annotations of faces that appear in the broadcasts. The annotations are composed of frames in which a face is seen, name of the person whose face is seen, gender of the person (male/female), and the image region containing the face. The intended use of the corpus is to train models of faces for face detection, face identification, face verification, and face tracking. For convinience two different JSON files are provided. They contain the same data, but in different arrangements. One file has the identity of the person on the top, the other has the object ID on the top, where the object is a facetrack. A demo python skript is available for showing how to access the data.
This package contains an extended version of the test collection used in the CLEF eHealth Information Retrieval tasks in 2013--2015. Compared to the original version, it provides complete query translations into Czech, French, German, Hungarian, Polish, Spanish and Swedish and additional relevance assessment.