# YAML Configuration Sub-Keys
> **Note:** Keys marked with an asterisk are optional and can be omitted.
## input

### bk_query
| Key | Type | Meaning |
|---|---|---|
| `bk_query` | string | The bookkeeping location of the desired input data. |
| `n_test_lfns`* | int | The number of files to use as input to test jobs. Only to be used for samples with very few output candidates. |
|  | float | The sampling fraction to use when sampling the input LFNs used in the production. |
|  | string | The seed to use when sampling input LFNs. |
| `dq_flags`* | sequence of strings | Which data-quality (DQ) flags are acceptable, e.g. OK or BAD. |
|  | sequence of strings | In addition to the data quality (DQ) requirement above, which extended DQ flags are acceptable. |
| `runs`* | sequence of integers | A sequence of data-taking runs to use as input. This can either be written as a typical sequence or as A:B, where runs from A to B inclusive will be used. Cannot be used with start_run/end_run. |
| `start_run`* | int | Filter the BK query output such that runs before this run number are excluded. Use with end_run, not with runs. |
| `end_run`* | int | Filter the BK query output such that runs after this run number are excluded. Use with start_run, not with runs. |
| `input_plugin`* | string | The input plugin setting, either `default` or `by-run`. |
| `keep_running`* | bool | Whether to keep running on new data as it comes in. |
| `smog2_state`* | string | Gas injected in SMOG2; possible choices are Hydrogen, Deuterium, Helium, Nitrogen, Oxygen, Neon, Argon, Krypton, Xenon. Each gas has two possible states, stable and unstable (e.g. Argon and ArgonUnstable). |
Here is a full example showing a bk_query input using all optional keys:
```yaml
job_name:
  input:
    bk_query: /some/MagUp/bookkeeping/path.DST
    n_test_lfns: 3
    dq_flags:
      - BAD
      - OK
    runs:
      - 269370
      - 269371
      - 269372
      # equivalent to "269370:269372"
    input_plugin: by-run
    keep_running: True
    smog2_state:
      - Argon
      - ArgonUnstable
```
```yaml
# Alternative using start_run and end_run instead of runs
job_name_alt:
  input:
    bk_query: /some/MagUp/bookkeeping/path.DST
    start_run: 269370
    end_run: 269372
    # This is equivalent to runs: ["269370:269372"]
```
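The A:B run-range form used above expands to an inclusive list of run numbers. A minimal sketch of that expansion (the `expand_runs` helper is illustrative, not part of the production system):

```python
def expand_runs(runs):
    """Expand a mixed list of run numbers and "A:B" range strings
    into a flat list of ints; ranges are inclusive on both ends."""
    expanded = []
    for entry in runs:
        if isinstance(entry, str) and ":" in entry:
            start, _, end = entry.partition(":")
            expanded.extend(range(int(start), int(end) + 1))
        else:
            expanded.append(int(entry))
    return expanded

print(expand_runs(["269370:269372"]))      # [269370, 269371, 269372]
print(expand_runs([269370, 269371, 269372]))  # same runs, written explicitly
```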
### job_name
| Key | Type | Meaning |
|---|---|---|
| `job_name` | string | The name of the job whose output should be the input of this job. |
| `filetype`* | string | The file type of the input file, for when your input job has multiple output files. |
|  | float | The sampling fraction to use (0.0 to 1.0). |
|  | string | The seed to use for reproducible sampling. |
Here is a full example showing a job_name input using all optional keys:
```yaml
strip_job:
  input:
    bk_query: /some/MagUp/bookkeeping/path.DST
  options: strip.py
tuple_job:
  input:
    job_name: strip_job
    filetype: DST
  options: tuple.py
```
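Sampling with a fraction and a seed, as described in the table above, is reproducible: the same seed always selects the same subset. A hash-based sketch of that idea (illustrative only; the production system's actual selection logic may differ):

```python
import hashlib

def sample_lfns(lfns, fraction, seed):
    """Deterministically keep roughly `fraction` of the LFNs: an LFN
    is kept when the hash of (seed, lfn) maps below the threshold,
    so the selection is stable across reruns and input orderings."""
    kept = []
    for lfn in lfns:
        digest = hashlib.sha256(f"{seed}:{lfn}".encode()).digest()
        # Map the first 8 bytes of the digest to a float in [0, 1)
        value = int.from_bytes(digest[:8], "big") / 2**64
        if value < fraction:
            kept.append(lfn)
    return kept

lfns = [f"/lhcb/data/file_{i}.dst" for i in range(1000)]
subset = sample_lfns(lfns, fraction=0.25, seed="mySeed")
print(len(subset))  # roughly 250
```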
### transform_ids
| Key | Type | Meaning |
|---|---|---|
| `transform_ids` | sequence of integers | A sequence of transformation IDs to use as input file sources. |
| `filetype`* | string | The file type of the input file, for when your input job has multiple output files. |
| `n_test_lfns`* | int | The number of files to use as input to test jobs. Only to be used for samples with very few output candidates. |
| `dq_flags`* | sequence of strings | Which data-quality (DQ) flags are acceptable, e.g. OK or BAD. |
| `runs`* | sequence of integers | A sequence of data-taking runs to use as input. This can either be written as a typical sequence or as A:B, where runs from A to B inclusive will be used. |
| `keep_running`* | bool | Whether to keep running on new data as it comes in. |
|  | float | The sampling fraction to use (0.0 to 1.0). |
|  | string | The seed to use for reproducible sampling. |
Here is a full example showing a transform_ids input using all optional keys:
```yaml
job_name:
  input:
    transform_ids:
      - 1234
      - 5678
    filetype: DST
    n_test_lfns: 3
    dq_flags:
      - BAD
      - OK
    runs:
      - 269370
      - 269371
      - 269372
      # equivalent to "269370:269372"
```
## options
The options configuration defines how to execute the analysis job. There are two formats:
### LbExec Options (Run 3+)
For Run 3 and later applications using lbexec:
Example:
```yaml
job_name:
  ...
  options:
    entrypoint: "my_production.script:my_job"
    extra_options:
      compression:
        optimise_baskets: false
    extra_args:
      - "do_this_thing"
```
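The entrypoint string follows a `module:function` convention. A minimal sketch of how such a string can be split and resolved, shown here against the standard library since `my_production.script` in the example is only a placeholder:

```python
import importlib

def resolve_entrypoint(entrypoint):
    """Split "package.module:function" on the colon and import the
    named attribute from the named module."""
    module_name, _, function_name = entrypoint.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, function_name)

# Resolve a stdlib function the same way an entrypoint like
# "my_production.script:my_job" would be resolved.
loads = resolve_entrypoint("json:loads")
print(loads('{"optimise_baskets": false}'))  # {'optimise_baskets': False}
```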
### Legacy Options (Run 1/2)
In general, for simple cases with just option files, you should use the shorthand:
```yaml
job_name:
  options:
    - "data_options.py"
    - "reco_options.py"
```
For Run 1 and Run 2 applications using gaudirun.py with options files:
| Key | Type | Meaning |
|---|---|---|
| `files` | list of strings | List of Python options files for the application. |
| `command`* | list of strings | (Optional) The command to call, e.g. `gaudirun.py`. |
Example:
```yaml
job_name:
  options:
    files:
      - "data_options.py"
      - "reco_options.py"
    command:
      - "gaudirun.py"
      - "-T"
```
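The `command` and `files` lists combine into the invocation that launches the application. A sketch of that composition (illustrative; the exact invocation performed by the production system may differ):

```python
def build_invocation(command, files):
    """Concatenate the command list and the options files into the
    argument vector that would launch the application."""
    return list(command) + list(files)

argv = build_invocation(
    ["gaudirun.py", "-T"],
    ["data_options.py", "reco_options.py"],
)
print(" ".join(argv))  # gaudirun.py -T data_options.py reco_options.py
```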
## Additional Job Configuration Keys
The following additional keys can be used to configure job behavior:
### Auto-Configuration Fields

These fields are used when `automatically_configure` is enabled; in general you won't need to set them yourself.
For Run 3 applications, set application-specific options with `extra_options` under `options` instead.
### Production Metadata
Examples:
```yaml
my_job:
  # Auto-configure overrides (Run 2 only)
  simulation: true
  data_type: "2018"
  dddb_tag: "dddb-20170721-3"
  conddb_tag: "cond-20170724"
  # Production metadata
  comment: "High priority analysis job"
  tags:
    campaign: "2024_analysis"
    priority: "urgent"
```
## Job Recipes
Recipes are predefined job configurations that can be used to simplify common analysis tasks. The recipe field allows you to specify a recipe that will automatically configure various job parameters.
### Split Trees Recipe
The split-trees recipe is used to split ROOT files based on key patterns, allowing you to separate different decay channels or data types into separate output files.
| Key | Type | Meaning |
|---|---|---|
| `name` | string | Must be `split-trees`. |
| `split` | list of objects | List of splitting configurations, each with `key` and `into` fields. |
Each split configuration has a `key` (a regular-expression pattern matched against the tree names in the input file) and an `into` (the output file that matching trees are written to).
Example:
```yaml
split_job:
  wg: B2CC
  inform: [alice]
  input:
    bk_query: /some/path/data.ROOT
  recipe:
    name: "split-trees"
    split:
      - key: "Tuple_SpruceSLB_(Bc).*?/DecayTree"
        into: "BC.ROOT"
      - key: "Tuple_SpruceSLB_(Bu).*?/DecayTree"
        into: "BU.ROOT"
```
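The `key` patterns above are regular expressions matched against tree names. A minimal sketch of the routing logic (the tree names here are made up for illustration; the real recipe operates on ROOT files):

```python
import re

# The split configuration from the example above.
split = [
    {"key": r"Tuple_SpruceSLB_(Bc).*?/DecayTree", "into": "BC.ROOT"},
    {"key": r"Tuple_SpruceSLB_(Bu).*?/DecayTree", "into": "BU.ROOT"},
]

def route_tree(tree_name, split):
    """Return the output file for the first pattern that matches
    the tree name, or None if no pattern matches."""
    for rule in split:
        if re.match(rule["key"], tree_name):
            return rule["into"]
    return None

# Hypothetical tree names, for illustration only.
print(route_tree("Tuple_SpruceSLB_Bc2JpsiMuNu/DecayTree", split))  # BC.ROOT
print(route_tree("Tuple_SpruceSLB_Bu2JpsiK/DecayTree", split))     # BU.ROOT
```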
### Filter Trees Recipe
The filter-trees recipe is used to apply filtering operations to ROOT trees.
| Key | Type | Meaning |
|---|---|---|
| `name` | string | Must be `filter-trees`. |
| `entrypoint` | string | The filtering entrypoint in `module:function` format. |
Example:
```yaml
filter_job:
  wg: Charm
  inform: [bob]
  input:
    bk_query: /some/path/raw_data.ROOT
  recipe:
    name: "filter-trees"
    entrypoint: "MyAnalysis.filter_script:run_preselection"
```
### Expand BK Path Recipe
The expand recipe is used to expand a single job definition into multiple jobs by substituting variables in the bookkeeping path.
| Key | Type | Meaning |
|---|---|---|
| `name` | string | Must be `expand`. |
| `path` | string | BK path template with format-string placeholders. |
| `substitute` | dict | Variables to substitute in the path. Values can be strings or lists. |
Example:
```yaml
template_job:
  wg: B2OC
  inform: [charlie]
  recipe:
    name: "expand"
    path: "/LHCb/Collision24/Beam6800GeV-VeloClosed-{polarity}/Real Data/Sprucing{sprucing}/{stream}/CHARM.DST"
    substitute:
      polarity: ["MagUp", "MagDown"]
      sprucing: ["24c3", "24c2"]
      stream: "94000000"
  options: "charm_analysis.py"
  output: "CHARM.ROOT"
```
This will generate 4 separate jobs (one for each combination of polarity and sprucing) with appropriate BK paths.
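The expansion above can be sketched as a Cartesian product over the list-valued substitutions, with each combination formatted into the path template (the `expand_paths` helper is illustrative, not the recipe's actual implementation):

```python
from itertools import product

def expand_paths(path, substitute):
    """Take the Cartesian product of the substitutions (string values
    count as a single choice) and format the path template once per
    combination."""
    keys = list(substitute)
    choices = [v if isinstance(v, list) else [v] for v in substitute.values()]
    return [path.format(**dict(zip(keys, combo))) for combo in product(*choices)]

paths = expand_paths(
    "/LHCb/Collision24/Beam6800GeV-VeloClosed-{polarity}/Real Data/"
    "Sprucing{sprucing}/{stream}/CHARM.DST",
    {
        "polarity": ["MagUp", "MagDown"],
        "sprucing": ["24c3", "24c2"],
        "stream": "94000000",
    },
)
print(len(paths))  # 4: one job per polarity/sprucing combination
```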