Please use the following text to cite this item:
Straka, Milan, 2024, CorPipe 23 multilingual CorefUD 1.2 model (corpipe23-corefud1.2-240906), LINDAT/CLARIAH-CZ digital library at the Institute of Formal and Applied Linguistics (ÚFAL), http://hdl.handle.net/11234/1-5673.
dc.contributor.author: Straka, Milan
dc.date.accessioned: 2024-10-07T15:25:38Z
dc.date.available: 2024-10-07T15:25:38Z
dc.date.issued: 2024-09-06
dc.description: The `corpipe23-corefud1.2-240906` is an `mT5-large`-based multilingual model for coreference resolution usable in CorPipe 23 <https://github.com/ufal/crac2023-corpipe>. It is released under the CC BY-NC-SA 4.0 license. The model is language agnostic (no corpus id on input), so it can in theory be used to predict coreference in any `mT5` language. However, the model expects empty nodes to be already present on input, predicted by the model at https://www.kaggle.com/models/ufal-mff/crac2024_zero_nodes_baseline/. This model was presented in the CorPipe 24 paper as an alternative to a single-stage approach, where the empty nodes are predicted jointly with coreference resolution (via http://hdl.handle.net/11234/1-5672), an approach about twice as fast but of slightly worse quality.
dc.identifier.uri: http://hdl.handle.net/11234/1-5673
dc.language.iso: cat
dc.language.iso: ces
dc.language.iso: chu
dc.language.iso: deu
dc.language.iso: eng
dc.language.iso: spa
dc.language.iso: fra
dc.language.iso: grc
dc.language.iso: hbo
dc.language.iso: hun
dc.language.iso: lit
dc.language.iso: nob
dc.language.iso: nno
dc.language.iso: pol
dc.language.iso: rus
dc.language.iso: tur
dc.publisher: Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
dc.relation.isreferencedby: https://arxiv.org/abs/2410.02756
dc.rights: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.label: PUB
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source.uri: https://github.com/ufal/crac2023-corpipe
dc.subject: coreference resolution
dc.subject: CorPipe
dc.subject: CorefUD
dc.title: CorPipe 23 multilingual CorefUD 1.2 model (corpipe23-corefud1.2-240906)
dc.type: toolService
local.branding: LINDAT / CLARIAH-CZ
local.contact.person: Milan Straka straka@ufal.mff.cuni.cz Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL)
local.files.count: 1
local.files.size: 1950408349
local.has.files: yes
local.language.name: Catalan
local.language.name: Czech
local.language.name: Church Slavic
local.language.name: German
local.language.name: English
local.language.name: Spanish
local.language.name: French
local.language.name: Ancient Greek (to 1453)
local.language.name: Ancient Hebrew
local.language.name: Hungarian
local.language.name: Lithuanian
local.language.name: Norwegian Bokmål
local.language.name: Norwegian Nynorsk
local.language.name: Polish
local.language.name: Russian
local.language.name: Turkish
local.sponsor: nationalFunds GX20-16819X Grantová agentura České republiky LUSyD – Language Understanding: from Syntax to Discourse
metashare.ResourceInfo#ContentInfo.detailedType: tool
metashare.ResourceInfo#ResourceComponentType#ToolServiceInfo.languageDependent: true
Files in this item
Name: corpipe23-corefud1.2-240906.zip
Size: 1.82 GB
Format: application/zip
Description: A multilingual coreference resolution model trained on CorefUD 1.2 based on `mT5-large` usable in CorPipe 23 <https://github.com/ufal/crac2023-corpipe>.
MD5: c27f81ef8b998588a0a79e80b05140ec
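Since the archive is large, it may be worth checking a download against the MD5 checksum listed for the file before unpacking it. The following is a minimal sketch in Python using only the standard library. The function names `md5_of_file` and `verify_download` are hypothetical helpers introduced for illustration, and the size check assumes that the `local.files.size` value (1950408349) is the size of the zip archive in bytes.

```python
import hashlib
import os

# Values copied from this record. EXPECTED_SIZE assumes that
# local.files.size is the byte size of the single zip archive.
EXPECTED_MD5 = "c27f81ef8b998588a0a79e80b05140ec"
EXPECTED_SIZE = 1950408349


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5=EXPECTED_MD5, expected_size=EXPECTED_SIZE):
    """Return True if the file at `path` matches the expected size and MD5."""
    # The size comparison is cheap, so do it before hashing the whole file.
    if os.path.getsize(path) != expected_size:
        return False
    return md5_of_file(path) == expected_md5.lower()


# Example use (the path is wherever the archive was saved locally):
#   verify_download("corpipe23-corefud1.2-240906.zip")
```

A `False` result most likely indicates a truncated or corrupted download, in which case fetching the archive again is the simplest remedy.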