+ "description": "<p>Flavour-tagging, the task of identifying heavy-flavour jets, is essential for many physics analyses at the ATLAS experiment. This dataset, released for public use, can be used to train and evaluate machine learning models for jet flavour-tagging, as described in <a href=\"https://arxiv.org/abs/2505.19689\">arXiv:2505.19689</a>. It aims to broaden interest in, and encourage further development of, innovative machine learning techniques to improve flavour-tagging performance.</p>\n<p>The dataset consists of approximately 50 million events from simulated top quark pair production at a centre-of-mass energy of 13.6 TeV. It is stored in HDF5 format and contains structured event-level, jet-level, track-level and truth hadron information. The dataset is designed to be compatible with the flavour-tagging algorithm development pipeline used at ATLAS, and is supported by accompanying instructions and example configurations provided in open-source repositories.</p>\n<p>To improve usability, the dataset is split into three mutually exclusive HDF5 files:</p>\n<ul>\n<li><code>mc-flavtag-ttbar-small.h5</code>: ~1.36 million events (~5.6 million jets)</li>\n<li><code>mc-flavtag-ttbar-medium.h5</code>: ~6.23 million events (~25.6 million jets)</li>\n<li><code>mc-flavtag-ttbar-large.h5</code>: ~41.1 million events (~168 million jets)</li>\n</ul>\n<p>Downloading all three files provides the complete dataset. The smaller subsets are useful for quick exploration or for prototyping workflows.</p>"