Knowledge-Aware Text-to-Text Transfer Transformer (KAT5)

We introduce knowledge-aware transfer learning with a text-to-text transfer transformer (KAT5) by leveraging a text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning such as T5, a model is first pre-trained on an unsupervised task with a language model objective and then fine-tuned on a downstream task. T5 explores several learning objectives, including masked language modeling (MLM), random span corruption, and deshuffling, but the model has limited means of integrating knowledge during pre-training. Here, we push the limits of this model by grafting knowledge such as entity and co-reference information, obtained by mapping Wikipedia to Wikidata, into pre-training. We build large-scale alignments between Wikipedia abstracts and Wikidata triples to facilitate pre-training of the KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular on entity and relation extraction (the CoNLL04, ADE, and NYT datasets) and on language generation tasks, including abstractive summarization (XSum, CNNDM) and machine translation.
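To make the pre-training idea concrete, below is a minimal, hypothetical sketch of how one aligned pair (a Wikipedia abstract plus its Wikidata triples) could be converted into a T5-style sentinel-masked input/target example. The span selection, the "knowledge:" prefix, and the triple serialization are illustrative assumptions, not the exact KAT5 pre-training procedure.

```python
# A minimal sketch: mask aligned entity mentions with T5 sentinel tokens and
# append the aligned Wikidata triples to the input as explicit knowledge.
# The formatting choices here are assumptions for illustration only.

def build_entity_masked_pair(text, entity_spans, triples):
    """Return a (source, target) pair in T5 span-corruption style."""
    input_parts, target_parts = [], []
    cursor = 0
    for i, (start, end) in enumerate(sorted(entity_spans)):
        sentinel = f"<extra_id_{i}>"
        input_parts.append(text[cursor:start] + sentinel)
        target_parts.append(f"{sentinel} {text[start:end]}")
        cursor = end
    input_parts.append(text[cursor:])
    target_parts.append(f"<extra_id_{len(entity_spans)}>")  # closing sentinel, as in T5
    # Serialize the aligned Wikidata triples as plain text.
    knowledge = " ".join(f"[{s} | {p} | {o}]" for s, p, o in triples)
    source = "".join(input_parts) + " knowledge: " + knowledge
    target = " ".join(target_parts)
    return source, target

abstract = "Barack Obama served as the 44th president of the United States."
spans = [(0, 12), (49, 62)]  # character offsets of "Barack Obama", "United States"
triples = [("Barack Obama", "position held", "President of the United States")]
print(build_entity_masked_pair(abstract, spans, triples))
```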
Our scripts a-job-pretraining.sh and a-job-finetuning.sh were mainly prepared for launching multi-node pre-training and fine-tuning on the ABCI computation cluster.
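For a quick single-GPU sanity check outside the ABCI job scripts, the text-to-text fine-tuning setup can be sketched with the Hugging Face transformers API. This is only a sketch under stated assumptions: "t5-base" is a stand-in for a local KAT5 checkpoint, and the relation-extraction prefix and triple linearization are hypothetical, not the exact format used by this repository.

```python
# A minimal text-to-text fine-tuning step on one toy relation-extraction
# example, assuming a T5-compatible checkpoint loadable with transformers.
import torch
from transformers import T5ForConditionalGeneration, T5TokenizerFast

model_name = "t5-base"  # stand-in; replace with a local KAT5 checkpoint path
tokenizer = T5TokenizerFast.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Hypothetical linearization of a CoNLL04-style example as text-to-text.
source = "relation extraction: John Smith works for Acme Corp in Boston ."
target = "(John Smith, Work_For, Acme Corp) (Acme Corp, OrgBased_In, Boston)"

inputs = tokenizer(source, return_tensors="pt")
labels = tokenizer(target, return_tensors="pt").input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
loss = model(**inputs, labels=labels).loss  # teacher-forced cross-entropy
loss.backward()
optimizer.step()
```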
Mohammad Golam Sohrab, Makoto Miwa (2024). KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer. In: Bifet, A., Krilavičius, T., Miliou, I., Nowaczyk, S. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2024. Lecture Notes in Computer Science, vol. 14949. Springer, Cham. https://doi.org/10.1007/978-3-031-70378-2_10
@InProceedings{10.1007/978-3-031-70378-2_10,
author="Sohrab, Mohammad Golam
and Miwa, Makoto",
editor="Bifet, Albert
and Krilavi{\v{c}}ius, Tomas
and Miliou, Ioanna
and Nowaczyk, Slawomir",
title="KAT5: Knowledge-Aware Transfer Learning with a Text-to-Text Transfer Transformer",
booktitle="Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track",
year="2024",
publisher="Springer Nature Switzerland",
address="Cham",
pages="157--173",
abstract="We introduce knowledge-aware transfer learning with a text-to-text transfer transformer (KAT5) by leveraging a text-to-text transfer transformer (T5) in the Wikipedia domain. In standard transfer learning like T5, a model is first pre-trained on an unsupervised data task with a language model objective before fine-tuning it on a downstream task. T5 explores several learning objectives, including masked language model (MLM), random span, and deshuffling, where the model is limited to exploring integrating knowledge during pre-training. Here, we push the limits of this model by grafting knowledge like entity and co-reference information by mapping Wikipedia and Wikidata during pre-training. We align large-scale alignments between Wikipedia abstract and Wikidata triples to facilitate our pre-training KAT5 model. Our approach can match or outperform task-specific models while using the same architecture and hyper-parameters, in particular in entity and relation extraction (CoNLL04, ADE, and NYT datasets), and language generation tasks, including abstractive summarization (XSum, CNNDM), and machine translation. Our code is publicly released on GitHub (https://github.com/aistairc/kat5) under the Apache 2.0 License.",
isbn="978-3-031-70378-2"
}
This research is based on results obtained from a project JPNP20006, commissioned by the New Energy and Industrial Technology Development Organization (NEDO).