Housekeeping
Archiving superseded or obsolete outputs
To ensure that precious storage resources are available for everybody (and for other LHCb activities!) when needed, Analysis Productions dataset owners and WG liaisons should archive samples that are:
- no longer used, i.e. obsolete or superseded by a newer version of the sample, or
- no longer being actively analysed.
For published analyses, a paper number can be assigned to the samples, indicating that they should be preserved on tape for posterity. Once samples are marked as archived, their data are considered safe to remove from disk storage.
Why not keep everything on disk all the time?
Nominally, once all the requested samples have been produced, the files are available “on disk” on LHCb EOS for immediate reading. This is the most convenient and accessible state for analysis. When analysis activities on the sample(s) are discontinued, the disk space should be freed for other activities. Should analysis activities resume, it is always possible to reproduce a sample from the original request files that were committed to the data package repository. Alternatively, if the sample is preserved on tape, it can be re-staged on request.
So there always remains a relatively quick path to access or reproduce archived samples if needed. This is possible because all analysis productions ever submitted are preserved in the AnalysisProductions data package: it’s as easy as checking out the files of the corresponding git tag, opening a new merge request, and submitting the request, as with any other production!
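As a rough illustration of that recovery path, the sketch below assembles the git commands one might run to restore a production's request files from a tag onto a fresh branch, ready for a new merge request. The tag, directory and branch names are hypothetical placeholders, not real values from the AnalysisProductions repository, and the commands are only printed unless `dry_run=False` is passed.

```python
import shlex
import subprocess


def restore_commands(tag, path, branch):
    """Build git commands that restore `path` as it was at `tag` on a new branch.

    All three arguments are placeholders to be supplied by the user; nothing
    here is specific to the real layout of the data package repository.
    """
    return [
        # Make sure the tag is known locally.
        ["git", "fetch", "--tags", "origin"],
        # Start a new branch for the re-submission.
        ["git", "checkout", "-b", branch],
        # Copy the files under `path` from the tag into the working tree and index.
        ["git", "checkout", tag, "--", path],
        # Record the restored files (fails if nothing differs from the current HEAD).
        ["git", "commit", "-m", f"Restore {path} from {tag}"],
        # Publish the branch so that a merge request can be opened from it.
        ["git", "push", "--set-upstream", "origin", branch],
    ]


def run_all(commands, dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    run_all(restore_commands("SOME_TAG", "my_analysis", "restore-my-analysis"))
```

Opening the merge request and submitting the production then follow the usual procedure; that part is not automated in this sketch.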
OK! I have samples that I don’t need anymore. How do I archive them?
Archival of samples can be performed on the Analysis Productions webpage.
Important
Archiving dataset(s) preserves the production metadata and does not immediately delete data from disk. It does, however, flag the samples as safe to remove, should the LHCb data management team need to free the space they occupy.
1. Navigate to your production and click Select to begin selecting datasets.
2. Click on the table row of each sample you would like to archive.
3. Click Finalise selection at the top of the table.
4. Check your selection carefully! Then select Archive, or schedule the archival for a later date if you still need the samples for a little while longer.
5. The web page should now confirm whether the operation succeeded or failed. The archived datasets will disappear from the analysis table, either immediately or at the scheduled time.
I’ve published my paper and don’t need my samples any longer. How do I preserve them?
Nice work! Navigate to the Analysis Productions webpage and do the following:
1. Navigate to your production and click Select to begin selecting datasets.
2. Click on the table row of each sample you would like to preserve.
3. Click Finalise selection at the top of the table.
4. Under the Assign to Publication heading, enter the relevant number (e.g. PAPER, DP, or FIGURE) in the fields and click Assign.
5. Follow the dialog instructions and ensure the operation was successful.
That’s it!
Although your sample may no longer be kept on disk, assigning it a publication number ensures that it is preserved on tape and remains available from there!
Reusing Samples
Given the broad overlap in the samples required by different analyses, it is possible that, when collecting the data/MC for a new analysis, many of the required tuples already exist in the Analysis Productions database. Rather than re-tupling these data, you can simply assign the existing tuples to a new analysis. To do this:
1. Open the Analysis Productions webpage.
2. Click Create new analysis.
3. Select the working group and choose a name for your production.
4. Use the filter tabs to filter the samples shown.
5. Select any samples you wish to add to your new production.
6. Click Add N samples.
If you would like to assign samples to more than one WG, create a new analysis as above for each additional WG, selecting the samples you would like to include each time.
You can also add existing samples to a production that already exists:
1. Open the Analysis Productions webpage.
2. Navigate to the production you want to add samples to.
3. Click Add samples.
4. Use the filter tabs to filter the samples shown.
5. Select any samples you wish to add to your production.
6. Click Add N samples.
Instructions for liaisons
- Review the merge requests for your working group once the request author has assigned them to you. Things to look out for:
  - The size of the output file doesn’t seem excessive.
  - The analysis isn’t including too many variables in the ntuple.
  - Communicate guidance on good practices for productions (particularly for Run 3).
  - Encourage people to share productions between analyses where practical.
  - There is no need to be too strict: it’s more important that productions are submitted promptly, and that issues are quickly understood and resolved.
- Approve and merge! N.B. This requires membership of the lhcb-dpa-emtf-rta-liaisons egroup. You are responsible for subscribing to it yourself.
- When the CI pipeline finishes, ensure that an issue was automatically created here.
  - The title should be “Productions for WG: ANALYSIS”.
  - It will show up on the merge request as a linked issue, and the merge request will also be linked to that issue.