Czech Proofreading Rules
Please use the following text to cite this item:
Hlaváčková, Dana; Machura, Jakub; Žižková, Hana; Kovář, Vojtěch and Nevěřilová, Zuzana, 2025,
Czech Proofreading Rules, LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-6001.
Authors
Hlaváčková, Dana; Machura, Jakub; Žižková, Hana; Kovář, Vojtěch; Nevěřilová, Zuzana
Item identifier
http://hdl.handle.net/11234/1-6001
Date issued
2025-10-19
Size
6649 entries, 175 classes
Description
The collection describes the Czech proofreading errors covered by Opravidlo 1.0. It consists of:
- the grammar rules applicable via the SET syntactic parser for Czech
- descriptions of the grammar rules and their relation to ERRANT codes
- an extended ERRANT ontology, built from the original ERRANT [Bryant et al., 2017] and its Czech extension [Náplava et al., 2022]
- a Python script that demonstrates how to apply the SET rules to proofreading
The dataset contains 6649 SET rules in the following main categories: agreement, capitals, commas, dependent clauses, non-grammatical structures, pronouns, spelling complex, and others. The error categories form a taxonomy of 175 classes with Czech and English descriptions, examples, and links to ERRANT codes.
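Judging by the file names listed for this item, each rule file `X.set` has a companion `X.set.csv` that describes its rule categories. The following is a minimal sketch of how the two could be paired after downloading the bundle. The function names are illustrative, and the assumption that the pairing is purely name-based comes from the file listing, not from the dataset's README.

```python
import zipfile
from pathlib import PurePosixPath


def pair_rule_files(names):
    """Group file names by rule category.

    Assumption: a rule file is named "<category>.set" and its description
    file "<category>.set.csv". Other files (README, RDF, images) are skipped.
    """
    pairs = {}
    for name in names:
        base = PurePosixPath(name).name
        # Check the longer suffix first, since ".set.csv" also contains ".set".
        if base.endswith(".set.csv"):
            category = base[: -len(".set.csv")]
            pairs.setdefault(category, {})["description"] = name
        elif base.endswith(".set"):
            category = base[: -len(".set")]
            pairs.setdefault(category, {})["rules"] = name
    return pairs


def pair_rule_files_in_zip(zip_path):
    """Apply pair_rule_files to the member names of a downloaded ZIP bundle."""
    with zipfile.ZipFile(zip_path) as archive:
        return pair_rule_files(archive.namelist())
```

For example, `pair_rule_files_in_zip("opravidlo_rules.zip.zip")` (a local path, assumed here) would return a dictionary keyed by category name such as `agreement` or `commas`, each pointing to its rule file and description file where both exist.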
Acknowledgement
European Commission
Project code: EC/HORIZON-RIA/101129751/EU
Project name: O.S.C.A.R.S. - Open Science Clusters’ Action for Research and Society
This item is Publicly Available
and licensed under:
Files in this item
- Name
- opravidlo_rules.zip.zip
- Size
- 513.65 KB
- Format
- application/zip
- Description
- All data in one bundle.
- MD5
- 1ecd9391616485d9795b2e9e59e088ec

- Contents
- other2.set.csv (567 B)
- capitals.set.csv (1 kB)
- README.md (7 kB)
- errant_extended_vocabulary.rdf (72 kB)
- pronouns.set.csv (2 kB)
- other.set (19 kB)
- commas.set.csv (490 B)
- commas_morphodita.set (412 kB)
- kdybysem.png (21 kB)
- agreement.set (168 kB)
- pronouns.set (668 kB)
- spelling_complex.set (44 kB)
- agreement.set.csv (2 kB)
- nongramatical_structures.set.csv (424 B)
- apply_rules.py (2 kB)
- spelling_complex.set.csv (609 B)
- capitals.set (488 kB)
- other2.set (3 kB)
- nongramatical_structures.set (17 kB)
- commas.set (205 kB)
- dependent_clauses.set (23 kB)
- other.set.csv (3 kB)
- dependent_clauses.set.csv (1 kB)
- commas_morphodita.set.csv (307 B)
- Name
- errant_extended_vocabulary.rdf
- Size
- 72.52 KB
- Format
- application/rdf+xml; charset=utf-8
- Description
- Ontology of errors extends ERRANT [Bryant et al., 2017] and Czech ERRANT [Náplava et al., 2022].
- MD5
- 4d9c1208b59e9e65b1a9e2be2a04c94a

- Name
- apply_rules.py
- Size
- 2.22 KB
- Format
- application/octet-stream
- Description
- Example script showing how to use SET syntactic parser for Czech, together with the rules.
- MD5
- 4e2d1a1df603c0e660d19c0b57ba9463

- Name
- agreement.set
- Size
- 168.32 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- 3f32d20766c5256364c730e4a73d3063

- Name
- capitals.set
- Size
- 488.52 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- 6e68843e857b6c8bb5f8e3ee515cbf1b

- Name
- commas.set
- Size
- 205.12 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- 852b3ea3e85817684e9f592d072d1811

- Name
- commas_morphodita.set
- Size
- 412.48 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- 024b5e50d0079f955f9f3c6fb281be1f

- Name
- dependent_clauses.set
- Size
- 23.17 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- e8dd6770f7bf72032dd71713a4d77abb

- Name
- nongramatical_structures.set
- Size
- 17.17 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- 34db75388caf5e040f197522542deb16

- Name
- other.set
- Size
- 19.23 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- d8b95e73218866fe28590b23ca231a48

- Name
- other2.set
- Size
- 3.21 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- cd444aaacd9cc1a324fcd9192d36fd47

- Name
- pronouns.set
- Size
- 668.7 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- 28139db82376e1043154e7a038f9692b

- Name
- spelling_complex.set
- Size
- 44.36 KB
- Format
- application/octet-stream
- Description
- Grammar rules for SET - syntactic parser for Czech.
- MD5
- 31dcf401b221ab429c0ae4676be0a6e9

- Name
- agreement.set.csv
- Size
- 2.81 KB
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- 0a6c4c4f4cb8a682ba63b8c102eb1fbc

- Name
- capitals.set.csv
- Size
- 1.67 KB
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- f96c06ed292be7d21943c955c3ceab1d

- Name
- commas.set.csv
- Size
- 490 B
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- 79d67d6f48e4376b8e947b879ff8418e

- Name
- commas_morphodita.set.csv
- Size
- 307 B
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- 255c971ce2453644566468a1d7c0e82a

- Name
- dependent_clauses.set.csv
- Size
- 1.54 KB
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- f82163edc47a89a243da6b0b557d6cb9

- Name
- nongramatical_structures.set.csv
- Size
- 424 B
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- 6ee9ec01a4d5e235a1b8ab8f18b596e1

- Name
- other.set.csv
- Size
- 3.13 KB
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- 98504a7a53c665538adaec9392ffd7cd

- Name
- other2.set.csv
- Size
- 567 B
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- 7d8fb38a15466defb1c703fb7f52b4af

- Name
- pronouns.set.csv
- Size
- 2.99 KB
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- 15a13098d642b6aade728b9034cb6fc2

- Name
- spelling_complex.set.csv
- Size
- 609 B
- Format
- text/csv
- Description
- Description of the rules categories, with links to the ERRANT ontology.
- MD5
- 7bb26ab7c469e31f1c0d004ff5765ee6

- Name
- kdybysem.png
- Size
- 21.54 KB
- Format
- image/png
- Description
- MD5
- f9d6756f3759bcc60875223462d10df3

- Name
- README.md
- Size
- 7.04 KB
- Format
- application/octet-stream
- Description
- How to use the rules.
- MD5
- 35a896103233a62bf09e6e7b53ff908d
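
Each file in this item is listed with an MD5 checksum, which can be used to check that a download is intact. The following is a minimal sketch using only Python's standard `hashlib` module. The function names and any local path passed to them are illustrative assumptions, not part of the dataset or of `apply_rules.py`.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, expected_hex):
    """Compare a file's MD5 digest with the value listed on the item page."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

For the bundle, a call such as `matches_listed_md5("opravidlo_rules.zip.zip", "1ecd9391616485d9795b2e9e59e088ec")` should return `True` if the file was downloaded to that (assumed) location without corruption. MD5 is suitable here only as an integrity check against accidental damage, not as protection against tampering.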


