Dataset Searcher tools
This section describes the tools for searching datasets on the grid.
In most cases, usage examples can be obtained with the --usage flag.
gb2_ds_search
Searches for datasets and metadata in DatasetSearchIndexDB.
To search for metadata, use: gb2_ds_search metadata --<flag_name> <flag_option>
To search for datasets, use: gb2_ds_search dataset --<flag_name> <flag_option>
To see which flags are available for the metadata and dataset sub-commands, use: gb2_ds_search metadata --help or gb2_ds_search dataset --help
The output .txt file can be used as input for gbasf2 via the --input_dslist option.
Examples:
$ gb2_ds_search metadata --table Campaigns
$ gb2_ds_search dataset --data_type data --data_level udst --skim_decay 14140101 --campaign SkimP10x1
$ gb2_ds_search dataset --data_type data --general_skim hadron --campaign proc11 -o listOfFiles.txt
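The file written with -o can be checked before it is handed to gbasf2. The following is a minimal Python sketch, assuming the file holds one dataset path per line (the file layout is not described in this section). The helper name read_dataset_list is hypothetical and is not part of gbasf2.

```python
from pathlib import Path


def read_dataset_list(path):
    """Return the non-empty lines of a dataset list file, in file order.

    Assumption: the file contains one dataset path per line.
    Hypothetical helper for illustration; not part of gbasf2.
    """
    lines = Path(path).read_text().splitlines()
    # Drop blank lines and surrounding whitespace.
    return [line.strip() for line in lines if line.strip()]
```

Calling read_dataset_list("listOfFiles.txt") after the last example above would give a list whose length can be compared with the number of datasets reported by the search.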
gb2_ds_search also supports a search syntax similar to the one used in gbasf2.
The --input_ds_search option lets the user pass all options together as one string of key-value pairs, each joined by an equals sign and separated from the next by a semicolon.
Parameters given in this string take precedence over individual options. Individual options can still be provided alongside the string: any option missing from the string is added to the search criteria. If an option is specified both in the string and as an individual option with conflicting values, a warning is issued and the value from the string is used.
Examples:
$ gb2_ds_search dataset --input_ds_search "data_type=data;data_level=udst;general_skim=hadron"
The above example is the same as:
$ gb2_ds_search dataset --data_type data --data_level udst --general_skim hadron
$ gb2_ds_search dataset --input_ds_search "data_type=data;data_level=udst" --general_skim hadron
The above example adds general_skim = hadron to the search criteria.
$ gb2_ds_search dataset --input_ds_search "data_type=data;data_level=udst;general_skim=hadron" --data_level mdst
The above example issues a warning for the redundant entry and uses the value given in --input_ds_search (data_level = udst) as the search criterion.
Note: Please ensure that the input_ds_search string uses the correct syntax and is enclosed in quotes.
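The precedence rule for --input_ds_search can be sketched in Python as follows. This is an illustration of the behaviour described above, not the gbasf2 implementation. The function name merge_search_criteria is hypothetical, and the sketch assumes that an individual option repeating the same value as the string is accepted silently.

```python
import warnings


def merge_search_criteria(input_ds_search, **individual_options):
    """Combine a "key=value;key=value" string with individual options.

    Hypothetical helper mirroring the documented rule: values from the
    string win, and a conflicting individual option triggers a warning.
    """
    criteria = {}
    for pair in input_ds_search.split(";"):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing semicolon
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {pair!r}")
        criteria[key.strip()] = value.strip()

    for key, value in individual_options.items():
        if key not in criteria:
            # Option missing from the string: add it to the criteria.
            criteria[key] = value
        elif criteria[key] != value:
            # Conflict: keep the value from the string and warn.
            warnings.warn(
                f"{key}: using {criteria[key]!r} from the search string, "
                f"ignoring {value!r}"
            )
    return criteria
```

With the string "data_type=data;data_level=udst" and general_skim="hadron", the result contains all three keys. Adding data_level="mdst" to a string that already sets data_level=udst leaves udst in place and emits a warning.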
gb2_ds_search allows multiple campaigns (separated by ,) in a dataset search. The search results show the number of datasets found for each campaign.
Examples:
$ gb2_ds_search dataset --data_type mc --campaign 'MC12b, MC13a, MC14rd_c'
$ gb2_ds_search dataset --input_ds_search="data_type=mc;campaign=MC12b, MC13a"
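When the campaign list is assembled in a script, the comma-separated value has to reach gb2_ds_search as a single argument. The following is a minimal Python sketch that builds the argument list for the first example above. The function name campaign_search_argv is hypothetical, and joining with ", " simply follows the spacing used in the examples.

```python
import shlex


def campaign_search_argv(data_type, campaigns):
    """Build the argument list for a multi-campaign dataset search.

    Hypothetical helper for illustration; not part of gbasf2.
    """
    # All campaigns go into one argument, separated by commas.
    campaign_value = ", ".join(campaigns)
    return [
        "gb2_ds_search", "dataset",
        "--data_type", data_type,
        "--campaign", campaign_value,
    ]


argv = campaign_search_argv("mc", ["MC12b", "MC13a", "MC14rd_c"])
# shlex.join adds the quotes a shell would need around the campaign value.
print(shlex.join(argv))
# prints: gb2_ds_search dataset --data_type mc --campaign 'MC12b, MC13a, MC14rd_c'
```

Passing the list directly to subprocess.run avoids shell quoting altogether, provided the gbasf2 environment is set up.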
usage: gb2_ds_search [-h] [-v] [--usage] {metadata,dataset,collection} ...
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default: 0
- --usage
show detailed usage
Sub-commands
metadata
Returns available metadata values.
gb2_ds_search metadata [-h] --table {Releases,GlobalTags,MCEventTypes,GeneralSkimNames,SkimDecayModes,Campaigns,DataTypes,DataLevels,BeamEnergies,BkgLevels}
Named Arguments
- --table
Possible choices: Releases, GlobalTags, MCEventTypes, GeneralSkimNames, SkimDecayModes, Campaigns, DataTypes, DataLevels, BeamEnergies, BkgLevels
Specify a table to search values from.
dataset
Searches for dataset LPNs matching the given metadata constraints.
gb2_ds_search dataset [-h] [-o OUTPUT_FILE] [--campaign CAMPAIGN] [--data_type DATA_TYPE] [--data_level DATA_LEVEL] [--run_high RUN_HIGH] [--exp_high EXP_HIGH] [--run_low RUN_LOW]
[--exp_low EXP_LOW] [--mc_event MC_EVENT] [--skim_decay SKIM_DECAY] [--general_skim GENERAL_SKIM] [--beam_energy BEAM_ENERGY] [--production_id PRODUCTION_ID]
[--global_tag GLOBAL_TAG] [--release RELEASE] [--input_ds_search INPUT_DS_SEARCH] [--bkg_level BKG_LEVEL]
Named Arguments
- -o, --output_file
Output a text file containing all matching datasets.
- --campaign
The MC or Data production campaign name.
- --data_type
mc or data
- --data_level
udst, mdst, etc
- --run_high
The highest allowed Run number (INTEGER VALUE) (inclusive).
- --exp_high
The highest allowed Experiment number (INTEGER VALUE) (inclusive).
- --run_low
The lowest allowed Run number (INTEGER VALUE) (inclusive).
- --exp_low
The lowest allowed Experiment number (INTEGER VALUE) (inclusive).
- --mc_event
The MC event type (“uubar”, “1110043100”, etc) used for
- --skim_decay
The skim type used to reconstruct and select events.
- --general_skim
The general skim name (all, hadron, etc)
- --beam_energy
4S, 5S, etc
- --production_id
The production ID, or a range of production IDs, of the dataset(s) (e.g. 2333 or 2334:2444).
- --global_tag
The global tag used to create the dataset.
- --release
The basf2 release used to create the dataset.
- --input_ds_search
Input search parameters separated by semicolons, same as in gbasf2 --input-ds-search
- --bkg_level
Background level for MC.
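The --production_id option accepts either a single ID or a range written as first:last. The following is a minimal Python sketch for validating such a value before it is passed on the command line. The function name parse_production_id is hypothetical, and rejecting a range whose start exceeds its end is an assumption of this sketch, not documented behaviour of gb2_ds_search.

```python
def parse_production_id(text):
    """Parse "2333" or "2334:2444" into a (first, last) pair of integers.

    Hypothetical helper for illustration; not part of gbasf2.
    """
    first, sep, last = text.strip().partition(":")
    low = int(first)
    # A single ID is treated as a range of length one.
    high = int(last) if sep else low
    if high < low:
        raise ValueError(f"range start {low} is greater than range end {high}")
    return low, high
```

For the two forms shown in the option description, parse_production_id("2333") gives (2333, 2333) and parse_production_id("2334:2444") gives (2334, 2444).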
collection
Returns a list of collections, or the contents and metadata of a collection.
gb2_ds_search collection [-h] [--list_all_collections [LIST_ALL_COLLECTIONS]] [--get_metadata GET_METADATA] [--list_datasets LIST_DATASETS]
Named Arguments
- --list_all_collections
specify collection with wildcard. Default value e.g. /belle/collection/(general or MC or Data)/*
- --get_metadata
specify collection name
- --list_datasets
specify collection name