1 | | -In 2022, LHCb has released the first 200 terabytes of the data via CERN OpenData Portal, making it available to the public. |
| 1 | +By the end of 2023, LHCb had released all of its Run I data to the general public via the CERN Open Data Portal. The data come in the `.DST` and `.MDST` formats, the same formats used internally by LHCb. |
2 | 2 |
3 | | -- The data comes in `.DST` and `.MDST` format, the same format is used by LHCb internally. |
4 | 3 | - Every data set released is described by an "Open Data Record" accessible through the Open Data Portal.
5 | | -- Open Data Records contain various bits of information about the selected data set (this is called metadata). An example of the types of metadata provided in the record is: |
6 | | - - Number of events in the dataset |
7 | | - - Number of files in the dataset |
8 | | - - Combined size in TB of the dataset |
9 | | - - Production ID |
10 | | - - Production Type |
11 | | - - Detector conditions (condb, dddb tags) |
12 | | - - List of Trigger Configuration Keys (TCKs) |
13 | | - - Scripts used for each production step |
14 | | - - List of Logical File Names (LFNs) on [LHCb DIRAC](https://lhcb-dirac.readthedocs.io/en/latest/). |
15 | | - |
16 | | -The metadata provided should help the user to navigate, select and work with with LHCb Open Data. |
17 | | - |
18 | | -Index of files is accessible both via a GUI or as a machine readable file. |
19 | | - |
20 | | -- Some instructions on how to use open data are pointed out in the records themselves. |
21 | | -- As well as the data records, an extensive list of LHCb stripping lines and their descriptions is provided as well. |
22 | | -- After selecting the desired stream, a stripping line description can be followed to obtain a number of cuts/conditions which could be used to filter the data further. |
23 | | -- Data can be accessed directly (eg. using [xrootd](https://xrootd.slac.stanford.edu/) protocol) or downloaded locally. |
24 | | -- It is suggested to further filter and categorize the data by writing out smaller data files in `.root` format (called ntuples). |
25 | | -- This is done in LHCb with the help of software called [DaVinci](https://lhcbdoc.web.cern.ch/lhcbdoc/davinci/). |
26 | | -- 'DaVinci' and other LHCb Software is available through [CVMFS](https://cernvm.cern.ch/fs/). |
27 | | -- Some initial instructions on working with DaVinci are provided in [LHCb Starterkit](https://lhcb.github.io/starterkit-lessons/first-analysis-steps/minimal-dv-job.html) web page. |
| 4 | +- Open Data Records contain various pieces of information about the selected data set (this is called metadata). Examples of the metadata provided in a record include: |
| 5 | + - Number of events in the dataset |
| 6 | + - Number of files in the dataset |
| 7 | + - Combined size in TB of the dataset |
| 8 | + - Production ID |
| 9 | + - Production Type |
| 10 | + - Detector conditions (condb, dddb tags) |
| 11 | + - List of Trigger Configuration Keys (TCKs) |
| 12 | + - Scripts used for each production step |
| 13 | + - List of Logical File Names (LFNs) on [LHCb DIRAC](https://lhcb-dirac.readthedocs.io/en/latest/). |
| 14 | + |
| 15 | +The [LHCb Open Data Guide](https://lhcb-opendata-guide.web.cern.ch/) and the metadata provided on the Open Data Portal should help the user to navigate, select, and work with LHCb Open Data. |
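
Two of the metadata items listed above, the number of files and the combined size in TB, could in principle be cross-checked against a machine-readable file index. The following is only a minimal sketch: the JSON layout, the field names `uri` and `size`, and the sample entries are assumptions made for illustration and are not taken from an actual Open Data Record.

```python
import json

# Hypothetical file-index entries. The field names ("uri", "size") and the
# URIs are illustrative assumptions, not the layout of a real record.
SAMPLE_INDEX = """[
  {"uri": "root://example.invalid//lhcb/data/file_0001.dst", "size": 5000000000},
  {"uri": "root://example.invalid//lhcb/data/file_0002.dst", "size": 3000000000},
  {"uri": "root://example.invalid//lhcb/data/file_0003.mdst", "size": 2000000000}
]"""


def summarize_index(index_json):
    """Return (number of files, combined size in TB) for a JSON file index."""
    entries = json.loads(index_json)
    total_bytes = sum(entry["size"] for entry in entries)
    # Decimal terabytes: 1 TB = 10**12 bytes.
    return len(entries), total_bytes / 1e12


n_files, size_tb = summarize_index(SAMPLE_INDEX)
print(f"{n_files} files, {size_tb:.3f} TB")
```

With a real index, the resulting counts could be compared with the values quoted in the corresponding record.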