HaCzech: Dataset of Handwritten Czech
This is a searchable version of the HaCzech: Dataset of Handwritten Czech corpus, as available via the LINDAT repository.
The dataset consists of handwritten Czech text lines sourced from two chronicles (municipal chronicles, 1931-1944; school chronicles, 1913-1933). The searchable corpus contains only the 2000 manually transcribed lines out of the 25k lines in the full dataset. The JSON transcriptions were converted to TEI/XML and then parsed using UDPipe.
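The JSON-to-TEI/XML conversion step can be pictured with a minimal sketch. It does not reproduce the conversion actually used for this corpus: the JSON field names (`id`, `text`), the sample record, the helper name `lines_to_tei`, and the choice of one `<ab>` element per line are all assumptions made for illustration, and the subsequent UDPipe parsing step is not shown.

```python
import json
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"  # standard TEI namespace


def lines_to_tei(records):
    """Wrap transcribed lines in a bare TEI skeleton.

    `records` is a list of dicts with hypothetical keys "id" and "text";
    the real HaCzech JSON layout may differ.
    """
    # Serialize TEI as the default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", TEI_NS)
    tei = ET.Element(f"{{{TEI_NS}}}TEI")
    text = ET.SubElement(tei, f"{{{TEI_NS}}}text")
    body = ET.SubElement(text, f"{{{TEI_NS}}}body")
    for rec in records:
        # One anonymous block per handwritten line (an illustrative choice).
        ab = ET.SubElement(body, f"{{{TEI_NS}}}ab", {"n": str(rec["id"])})
        ab.text = rec["text"]
    return ET.tostring(tei, encoding="unicode")


# Made-up sample record, not taken from the dataset.
sample = json.loads('[{"id": "line-1", "text": "Obecní kronika"}]')
tei_xml = lines_to_tei(sample)
print(tei_xml)
```

A full TEI document would also need a `teiHeader` with metadata about the source chronicles; the sketch leaves it out to keep the line-level mapping visible.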