CorPipe 24 Multilingual CorefUD 1.2 Model (corpipe24-corefud1.2-240906)

Please use the following text to cite this item:
Straka, Milan, 2024, CorPipe 24 Multilingual CorefUD 1.2 Model (corpipe24-corefud1.2-240906), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-5672.
Date issued
2024-09-06
Description
The `corpipe24-corefud1.2-240906` is a `mT5-large`-based multilingual model for coreference resolution, usable in CorPipe 24 (https://github.com/ufal/crac2024-corpipe). It is released under the CC BY-NC-SA 4.0 license. The model is language agnostic (it takes no corpus id on input), so in theory it can be used to predict coreference in any `mT5` language. The model also jointly predicts the empty nodes needed for zero coreference. The paper introducing this model also presents an alternative two-stage approach, which first predicts empty nodes (via https://www.kaggle.com/models/ufal-mff/crac2024_zero_nodes_baseline/) and then performs coreference resolution (via http://hdl.handle.net/11234/1-5673); it is roughly twice as slow but slightly better.
 Files in this item
Name
corpipe24-corefud1.2-240906.zip
Size
1.92 GB
Format
application/zip
Description
A multilingual coreference resolution model based on `mT5-large`, trained on CorefUD 1.2, and usable in CorPipe 24 <https://github.com/ufal/crac2024-corpipe>.
MD5
9525437e590b36187c0e5d095ffdfd69
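The MD5 digest listed above can be used to check that the archive was downloaded intact. The following is a minimal sketch, assuming the file was saved under its listed name in the current working directory; the helper names `md5_of_file` and `verify_archive` are illustrative and are not part of CorPipe.

```python
import hashlib
from pathlib import Path

# Values copied from the file listing above.
ARCHIVE_NAME = "corpipe24-corefud1.2-240906.zip"
EXPECTED_MD5 = "9525437e590b36187c0e5d095ffdfd69"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`.

    The file is read in chunks so that a multi-gigabyte archive
    does not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_md5=EXPECTED_MD5):
    """Return True if the file's MD5 matches `expected_md5` (case-insensitive)."""
    return md5_of_file(path) == expected_md5.lower()


if __name__ == "__main__":
    # Assumption: the archive sits in the current working directory.
    archive = Path(ARCHIVE_NAME)
    if not archive.exists():
        print(f"{archive} not found; download it first.")
    elif verify_archive(archive):
        print("MD5 checksum matches.")
    else:
        print("MD5 checksum mismatch; the download may be corrupted.")
```

A mismatch usually indicates an incomplete or corrupted download; MD5 is suitable here only as an integrity check, not as protection against deliberate tampering.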