Please use the following text to cite this item or export to a predefined format:
Straka, Milan, 2024,
CorPipe 23 multilingual CorefUD 1.2 model (corpipe23-corefud1.2-240906), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL),
http://hdl.handle.net/11234/1-5673.
| dc.contributor.author | Straka, Milan |
| dc.date.accessioned | 2024-10-07T15:25:38Z |
| dc.date.available | 2024-10-07T15:25:38Z |
| dc.date.issued | 2024-09-06 |
| dc.description | The `corpipe23-corefud1.2-240906` is a `mT5-large`-based multilingual model for coreference resolution usable in CorPipe 23 <https://github.com/ufal/crac2023-corpipe>. It is released under the CC BY-NC-SA 4.0 license. The model is language agnostic (no corpus id on input), so in theory it can be used to predict coreference in any language covered by `mT5`. However, the model expects empty nodes to be already present on input, predicted by https://www.kaggle.com/models/ufal-mff/crac2024_zero_nodes_baseline/. This model was presented in the CorPipe 24 paper as an alternative to the single-stage approach, in which the empty nodes are predicted jointly with coreference resolution (via http://hdl.handle.net/11234/1-5672); the single-stage approach is roughly twice as fast but of slightly worse quality. |
| dc.identifier.uri | http://hdl.handle.net/11234/1-5673 |
| dc.language.iso | cat |
| dc.language.iso | ces |
| dc.language.iso | chu |
| dc.language.iso | deu |
| dc.language.iso | eng |
| dc.language.iso | spa |
| dc.language.iso | fra |
| dc.language.iso | grc |
| dc.language.iso | hbo |
| dc.language.iso | hun |
| dc.language.iso | lit |
| dc.language.iso | nob |
| dc.language.iso | nno |
| dc.language.iso | pol |
| dc.language.iso | rus |
| dc.language.iso | tur |
| dc.publisher | Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
| dc.relation.isreferencedby | https://arxiv.org/abs/2410.02756 |
| dc.rights | Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) |
| dc.rights.label | PUB |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ |
| dc.source.uri | https://github.com/ufal/crac2023-corpipe |
| dc.subject | coreference resolution |
| dc.subject | CorPipe |
| dc.subject | CorefUD |
| dc.title | CorPipe 23 multilingual CorefUD 1.2 model (corpipe23-corefud1.2-240906) |
| dc.type | toolService |
| local.branding | LINDAT / CLARIAH-CZ |
| local.contact.person | Milan Straka straka@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL) |
| local.files.count | 1 |
| local.files.size | 1950408349 |
| local.has.files | yes |
| local.language.name | Catalan |
| local.language.name | Czech |
| local.language.name | Church Slavic |
| local.language.name | German |
| local.language.name | English |
| local.language.name | Spanish |
| local.language.name | French |
| local.language.name | Ancient Greek (to 1453) |
| local.language.name | Ancient Hebrew |
| local.language.name | Hungarian |
| local.language.name | Lithuanian |
| local.language.name | Norwegian Bokmål |
| local.language.name | Norwegian Nynorsk |
| local.language.name | Polish |
| local.language.name | Russian |
| local.language.name | Turkish |
| local.sponsor | nationalFunds GX20-16819X Grantová agentura České republiky LUSyD – Language Understanding: from Syntax to Discourse |
| metashare.ResourceInfo#ContentInfo.detailedType | tool |
| metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent | true |
This item is Publicly Available and licensed under: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
Files in this item
- Name: corpipe23-corefud1.2-240906.zip
- Size: 1.82 GB
- Format: application/zip
- Description: A multilingual coreference resolution model trained on CorefUD 1.2 based on `mT5-large` usable in CorPipe 23 <https://github.com/ufal/crac2023-corpipe>.
- MD5: c27f81ef8b998588a0a79e80b05140ec
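
Because the archive is close to 2 GB, it may be worth checking a download against the size and MD5 listed in this record before unpacking it. The following is a minimal sketch using only the Python standard library; the local file path and the helper names `md5_of_file` and `verify_archive` are assumptions for illustration and are not part of CorPipe.

```python
import hashlib
from pathlib import Path

# Values copied from this record (MD5 field and local.files.size).
EXPECTED_MD5 = "c27f81ef8b998588a0a79e80b05140ec"
EXPECTED_SIZE = 1950408349  # bytes


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True only if both the byte size and the MD5 digest match."""
    path = Path(path)
    if path.stat().st_size != expected_size:
        return False  # cheap check first, avoids hashing a truncated download
    return md5_of_file(path) == expected_md5


if __name__ == "__main__":
    # Hypothetical location of the downloaded archive.
    archive = Path("corpipe23-corefud1.2-240906.zip")
    if archive.exists():
        print("archive OK" if verify_archive(archive) else "archive MISMATCH")
    else:
        print(f"{archive} not found; download it from the record first")
```

The size comparison runs first because it is nearly free, so an interrupted download is rejected without hashing the whole file.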

- corpipe23-corefud1.2-240906
  - LICENSE (20 kB)
  - README.md (5 kB)
  - options.json (2 kB)
  - model.h5 (2 GB)
  - tags.txt (1 kB)

