apd

Analysis Production Data package.

Programmatic interface to the Analysis Productions database, which allows retrieving information about the samples produced. It queries a REST endpoint provided by the web application and caches the data locally.

class apd.AnalysisData(working_group, analysis, metadata_cache=None, data_cache=None, api_url='https://lbap.app.cern.ch', ap_date=None, **kwargs)

Class giving access to the metadata for a specific analysis.

Default values for the tags used to filter the data can be passed as keyword arguments to the constructor, alongside the required working group and analysis names, e.g.

datasets = AnalysisData("b2oc", "b02dkpi", polarity="magdown")

Calling the instance, as in datasets(...), returns a list of PFNs corresponding to the requested dataset. Keyword arguments are interpreted as tags.

Combining all of the tags must give a unique dataset, otherwise an error is raised.

To get PFNs from multiple datasets, lists can be passed as arguments. The single call

datasets(eventtype="27163904", datatype=[2017, 2018], polarity=["magup", "magdown"])

is equivalent to

datasets(eventtype="27163904", datatype=2017, polarity="magup")
+ datasets(eventtype="27163904", datatype=2017, polarity="magdown")
+ datasets(eventtype="27163904", datatype=2018, polarity="magup")
+ datasets(eventtype="27163904", datatype=2018, polarity="magdown")
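The equivalence above amounts to taking the Cartesian product of the list-valued tags. The following is a minimal sketch of that expansion using only the standard library; `expand_tag_combinations` is a hypothetical helper written for illustration and is not part of the apd package.

```python
from itertools import product


def expand_tag_combinations(**tags):
    """Expand list-valued tags into one dict per combination (hypothetical helper)."""
    names = list(tags)
    # Wrap scalar values (including str) in a one-element list so that
    # product() can treat every tag uniformly.
    value_lists = [
        list(value) if isinstance(value, (list, tuple)) else [value]
        for value in tags.values()
    ]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


combos = expand_tag_combinations(
    eventtype="27163904",
    datatype=[2017, 2018],
    polarity=["magup", "magdown"],
)
for combo in combos:
    print(combo)
```

Each resulting dict corresponds to one of the four single-dataset calls listed above, in the same order.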

all_samples()

Returns all the samples in this Analysis Production, i.e. without filtering by the default tags.

summary(tags: Optional[list] = None) -> dict

Prepares a summary of the Analysis Production info.

class apd.ApdReturnType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
LFN = 1
PFN = 0
SAMPLE = 2
class apd.SampleCollection(info=None, tags=None)

Class wrapping the AnalysisProduction metadata.

LFNs()

Collects the LFNs.

PFNs()

Collects the PFNs.

available_tags()

Returns a superset of all tags used in the samples of the collection.

byte_count()

Collects the total byte count from all the samples.

file_count()

Collects the number of files from all the samples.

filter(*args, **tags)

Filter the requests according to the tag values passed as parameters.

groupby(tags=None)

Groups the samples by the specified tags. If no list of tags is specified, the existing ones are used.

itertags()

Iterate over all the tags present in all the samples.

report()

Returns a report on the samples in the collection.
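To make the tag-based operations above more concrete, here is a minimal sketch of filtering, grouping and tag collection on plain dictionaries. It does not use SampleCollection itself: the sample records and the functions `collect_available_tags`, `filter_samples` and `group_by` are assumptions made for illustration, and the real class may behave differently.

```python
from collections import defaultdict

# Hypothetical sample records: each one is a dict of tag name -> tag value.
samples = [
    {"eventtype": "27163904", "datatype": "2017", "polarity": "magup"},
    {"eventtype": "27163904", "datatype": "2017", "polarity": "magdown"},
    {"eventtype": "27163904", "datatype": "2018", "polarity": "magup"},
]


def collect_available_tags(samples):
    """Map every tag name seen in any sample to the set of values it takes."""
    tags = defaultdict(set)
    for sample in samples:
        for name, value in sample.items():
            tags[name].add(value)
    return dict(tags)


def filter_samples(samples, **tags):
    """Keep only the samples whose tags match every requested value."""
    return [s for s in samples if all(s.get(k) == v for k, v in tags.items())]


def group_by(samples, tag_names):
    """Group samples by the tuple of values they hold for tag_names."""
    groups = defaultdict(list)
    for sample in samples:
        key = tuple(sample.get(name) for name in tag_names)
        groups[key].append(sample)
    return dict(groups)


print(collect_available_tags(samples))
print(filter_samples(samples, datatype="2017"))
print(group_by(samples, ["datatype"]))
```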

apd.auth(url: str, ignore_nonroot: bool = True) -> str

Take a PFN and return one with read-only credentials appended.

apd.authw(url: str, ignore_nonroot: bool = True) -> str

Take a PFN and return one with read-write credentials appended.

apd.cache_ap_info(cache_dir, working_group, analysis, loader=None, api_url='https://lbap.app.cern.ch', ap_date=None)

Fetch the AP info and cache it locally.

apd.fetch_ap_info(working_group, analysis, loader=None, api_url='https://lbap.app.cern.ch', ap_date=None)

Fetch the AP info from the service.

apd.get_analysis_data(working_group, analysis, metadata_cache=None, data_cache=None, api_url='https://lbap.app.cern.ch', ap_date=None, **kwargs)

Main method to get analysis production information.

Returns the AnalysisData information already loaded in the current process, if any. Otherwise, it loads the information from the disk cache and, if the cache is missing or invalid, fetches it from the REST API.
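The lookup order described above (in-process copy, then disk cache, then REST API) can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `get_info`, `fake_fetch`, the JSON file layout and the file naming are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

_in_process = {}  # memo shared by all callers within this process


def get_info(working_group, analysis, cache_dir, fetch):
    """Resolve AP info: in-process memo, then disk cache, then `fetch`."""
    key = (working_group, analysis)
    if key in _in_process:
        return _in_process[key]

    # Hypothetical cache layout: one JSON file per (working group, analysis).
    cache_file = Path(cache_dir) / f"{working_group}_{analysis}.json"
    try:
        info = json.loads(cache_file.read_text())
    except (FileNotFoundError, json.JSONDecodeError):
        # Cache missing or invalid: fall back to the remote service
        # and store the result for later calls.
        info = fetch(working_group, analysis)
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_text(json.dumps(info))

    _in_process[key] = info
    return info


calls = []


def fake_fetch(working_group, analysis):
    """Stand-in for the REST call; records each invocation."""
    calls.append((working_group, analysis))
    return {"samples": []}


with tempfile.TemporaryDirectory() as tmp:
    first = get_info("b2oc", "b02dkpi", tmp, fake_fetch)   # fetched
    second = get_info("b2oc", "b02dkpi", tmp, fake_fetch)  # served from memory
    _in_process.clear()
    third = get_info("b2oc", "b02dkpi", tmp, fake_fetch)   # served from disk

print(len(calls))
```

In this sketch the stand-in fetcher is invoked only once: the second call is answered from memory and the third from the file written by the first.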

apd.load_ap_info_from_single_file(filename)

Load the AP info from a cache file (ONLY FOR TESTS).

Analysis Data

Interface to the Analysis Production data.

Provides:
  • the get_analysis_data method, the principal way to look up AP info. It returns an AnalysisData instance.
  • the AnalysisData class, which allows querying information about Analysis Productions.

class apd.analysis_data.AnalysisData(working_group, analysis, metadata_cache=None, data_cache=None, api_url='https://lbap.app.cern.ch', ap_date=None, **kwargs)

Class giving access to the metadata for a specific analysis.

Default values for the tags used to filter the data can be passed as keyword arguments to the constructor, alongside the required working group and analysis names, e.g.

datasets = AnalysisData("b2oc", "b02dkpi", polarity="magdown")

Calling the instance, as in datasets(...), returns a list of PFNs corresponding to the requested dataset. Keyword arguments are interpreted as tags.

Combining all of the tags must give a unique dataset, otherwise an error is raised.

To get PFNs from multiple datasets, lists can be passed as arguments. The single call

datasets(eventtype="27163904", datatype=[2017, 2018], polarity=["magup", "magdown"])

is equivalent to

datasets(eventtype="27163904", datatype=2017, polarity="magup")
+ datasets(eventtype="27163904", datatype=2017, polarity="magdown")
+ datasets(eventtype="27163904", datatype=2018, polarity="magup")
+ datasets(eventtype="27163904", datatype=2018, polarity="magdown")

all_samples()

Returns all the samples in this Analysis Production, i.e. without filtering by the default tags.

summary(tags: Optional[list] = None) -> dict

Prepares a summary of the Analysis Production info.

class apd.analysis_data.ApdReturnType(value, names=<not given>, *values, module=None, qualname=None, type=None, start=1, boundary=None)
LFN = 1
PFN = 0
SAMPLE = 2
apd.analysis_data.get_analysis_data(working_group, analysis, metadata_cache=None, data_cache=None, api_url='https://lbap.app.cern.ch', ap_date=None, **kwargs)

Main method to get analysis production information.

Returns the AnalysisData information already loaded in the current process, if any. Otherwise, it loads the information from the disk cache and, if the cache is missing or invalid, fetches it from the REST API.

Command

apd.command.common_docstr(sep='\n')

Append the common help to the docstrings of all the functions.

apd.command.exception_handler(exception_type, exception, _)

AP Info

Internal tools to load and interpret information from the AnalysisProductions data endpoint.

This module contains the tools to retrieve the data from the AnalysisProductions endpoint (with the APDataDownloader class). The retrieved JSON can be loaded into a SampleCollection instance.

class apd.ap_info.APDataDownloader(api_url='https://lbap.app.cern.ch')

Utility class that fetches the Analysis Production information.

get_ap_info(working_group, analysis, ap_date=None)
get_ap_tags(working_group, analysis, ap_date=None)
get_user_info()
exception apd.ap_info.InvalidCacheError

Exception to signal that the AP info cache is invalid.

class apd.ap_info.SampleCollection(info=None, tags=None)

Class wrapping the AnalysisProduction metadata.

LFNs()

Collects the LFNs.

PFNs()

Collects the PFNs.

available_tags()

Returns a superset of all tags used in the samples of the collection.

byte_count()

Collects the total byte count from all the samples.

file_count()

Collects the number of files from all the samples.

filter(*args, **tags)

Filter the requests according to the tag values passed as parameters.

groupby(tags=None)

Groups the samples by the specified tags. If no list of tags is specified, the existing ones are used.

itertags()

Iterate over all the tags present in all the samples.

report()

Returns a report on the samples in the collection.

apd.ap_info.cache_ap_info(cache_dir, working_group, analysis, loader=None, api_url='https://lbap.app.cern.ch', ap_date=None)

Fetch the AP info and cache it locally.

apd.ap_info.check_tag_value_possible(tag, values, available_tags)

Check whether the tag exists in available_tags, or whether a tag of similar name exists, in the sense of difflib.get_close_matches. If the tag exists, also check that each value is one of the existing values for that tag.
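As an illustration of the kind of check described above, the sketch below validates a tag and its values against a dictionary of known tags and uses difflib.get_close_matches to suggest similar names. The function `check_tag`, the choice to raise ValueError and the error messages are assumptions; the actual behaviour of check_tag_value_possible is not documented here.

```python
import difflib


def check_tag(tag, values, available_tags):
    """Raise ValueError if `tag` or any of `values` is unknown (hypothetical helper).

    `available_tags` maps each known tag name to the set of values it takes.
    """
    if tag not in available_tags:
        # Look for known tag names that resemble the one requested.
        suggestions = difflib.get_close_matches(tag, list(available_tags))
        hint = f" Did you mean {suggestions}?" if suggestions else ""
        raise ValueError(f"Unknown tag {tag!r}.{hint}")
    for value in values:
        if value not in available_tags[tag]:
            raise ValueError(f"Unknown value {value!r} for tag {tag!r}")


available = {"polarity": {"magup", "magdown"}, "datatype": {"2017", "2018"}}

check_tag("polarity", ["magup"], available)  # known tag and value: no error

try:
    check_tag("polarty", ["magup"], available)  # misspelt tag name
except ValueError as err:
    message = str(err)
    print(message)
```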

apd.ap_info.fetch_ap_info(working_group, analysis, loader=None, api_url='https://lbap.app.cern.ch', ap_date=None)

Fetch the AP info from the service.

apd.ap_info.iterable(arg)

Version of Iterable that excludes str.
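A check of this kind is useful because a tag value such as "magup" is itself iterable and would otherwise be treated as a sequence of characters. The following is a minimal sketch consistent with the one-line description above; it is written for illustration and is not taken from the package source.

```python
from collections.abc import Iterable


def iterable(arg):
    """Return True for iterables other than str."""
    return isinstance(arg, Iterable) and not isinstance(arg, str)


print(iterable(["magup", "magdown"]))
print(iterable("magup"))
print(iterable(2017))
```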

apd.ap_info.load_ap_info(cache_dir, working_group, analysis, ap_date=None, maxlifetime=None)

Load the AP info from a cache file.

apd.ap_info.load_ap_info_from_single_file(filename)

Load the AP info from a cache file (ONLY FOR TESTS).

apd.ap_info.safe_casefold(a)

Casefold that can be called on any type; does nothing on non-str values.
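A minimal sketch consistent with the description above, useful for comparing tag values case-insensitively when some of them are not strings (for example, integer years). It is an illustration, not the package source.

```python
def safe_casefold(a):
    """Casefold str values and return anything else unchanged."""
    return a.casefold() if isinstance(a, str) else a


print(safe_casefold("MagUp"))
print(safe_casefold(2017))
```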