KUKY1.0

Please use the following text to cite this item:
Cinková, Silvie; et al., 2024, KUKY1.0, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-5812.
Date issued
2024-12-31
Size
224 texts, 374251 tokens
Language(s)
Czech
Description
KUKY is a curated selection of 224 Czech administrative and legal documents for readability research, stored in two JSON files. The documents come partly from public databases (Office of the Ombudsman, courts) and partly from private sources (letters, announcements by local public administration). Some documents come as documented draft-revision pairs. The documents are manually enriched with a two-level annotation: "Relevance Stoplight" and "Speech Acts". This annotation mimics the way a plain-language expert scrutinizes a document before redesigning it for better readability. In the first step ("Relevance Stoplight"), the expert closely reads the entire document, flags problematic passages as either incomprehensible or superfluous, and approves the others as relevant. In the second step ("Speech Acts"), the expert works with the relevant text according to a genre-specific template. At the metadata level, the documents are graded for readability as perceived by experienced teachers of plain legal writing.
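Since the description only states that the data are stored in two JSON files, a first look at them can be done without assuming anything about their internal schema. The following is a minimal sketch under that assumption: the helper name `summarize_json` is hypothetical, the files are assumed to be UTF-8 encoded and downloaded into the current directory, and the actual structure of the annotation is described in the accompanying HTML documentation, not here.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level structure.

    Hypothetical helper: it makes no assumption about the KUKY schema and
    only reports whether the top level is an object, an array, or a scalar.
    """
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    if isinstance(data, dict):
        # JSON object keys are always strings, so sorting them is safe.
        return {"type": "object", "size": len(data), "sample_keys": sorted(data)[:5]}
    if isinstance(data, list):
        return {"type": "array", "size": len(data), "sample_keys": []}
    return {"type": type(data).__name__, "size": 1, "sample_keys": []}


if __name__ == "__main__":
    # File names as listed in this item; assumed to be in the current directory.
    for name in ("argumentative.json", "normative.json"):
        if Path(name).exists():
            print(name, summarize_json(name))
        else:
            print(name, "not found in the current directory")
```

The summary is meant only as a starting point for exploring the files before consulting the documentation for the meaning of individual fields.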
Files in this item
Name
argumentative.json
Size
4.5 MB
Format
application/octet-stream
Description
Unknown
MD5
1ad713fb72036d62db3c6ce39939c5d8
Name
normative.json
Size
2.66 MB
Format
application/octet-stream
Description
Unknown
MD5
b1f88b835790598a09991f424714c678
Name
KUKY_1_0-documentation.html
Size
2.06 MB
Format
text/html
Description
HTML
MD5
8f94503c492f3a72834c677612d309d4
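The MD5 values listed for the three files can be used to check that a download is complete and unmodified. The following is a minimal sketch, assuming the files were saved under their listed names in a single directory. The names `EXPECTED_MD5`, `md5_of_file`, and `verify` are illustrative and not part of the dataset or the repository.

```python
import hashlib
from pathlib import Path

# MD5 checksums as listed for the files in this item.
EXPECTED_MD5 = {
    "argumentative.json": "1ad713fb72036d62db3c6ce39939c5d8",
    "normative.json": "b1f88b835790598a09991f424714c678",
    "KUKY_1_0-documentation.html": "8f94503c492f3a72834c677612d309d4",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read until handle.read() returns the empty bytes sentinel.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each listed file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.exists() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

Reading in chunks keeps memory use low regardless of file size. MD5 is suitable here only as an integrity check against transfer errors, not as protection against deliberate tampering.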