Dataset management tools
The usage of all available gb2 tools is shown below.
In most cases, examples of how to use a tool can be obtained with the --usage flag.
gb2_ds_get
Download remote files to a local directory (default: current directory). The --new option downloads using Rucio; this functionality is experimental and will eventually become the default. Examples:
$ gb2_ds_get myProject
$ gb2_ds_get "project/sub00/file00*.root"
$ gb2_ds_get myproject --new
usage: gb2_ds_get [-h] [-v] [--usage] [-o local directory] [-u USER] [-r {MC,data,user}] [-l] [--new] [-f] [--noSubDir] [-i file with lfns] [--failed_lfns filename.txt] [--se SE]
dataset
Positional Arguments
- dataset
specify dataset name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -o, --output_dir
path to local directory
- -u, --user
specify user name
- -r, --subcate
Possible choices: MC, data, user
specify a dataset category
- -l, --long
long listing (-ll: extra long)
- --new
Enables experimental feature(s) for the tool.
Default:
False
- -f, --force
skip confirmation
Default:
False
- --noSubDir
Avoid downloading of files in subdirectories of the given dataset
Default:
False
- -i, --input_dslist
Input file with list of LFNs to download
- --failed_lfns
Set the name of the text file where failed LFNs will be stored
- --se
Select an SE
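When gb2_ds_get is driven from a script, the documented options can be assembled into an argument list before the tool is launched. The following is a minimal Python sketch and not part of the gb2 tools: the helper name build_ds_get_command and its keyword arguments are hypothetical, and only flags listed above are used. The command is built and printed, not executed.

```python
import shlex


def build_ds_get_command(dataset, output_dir=None, use_new=False,
                         force=False, no_subdir=False):
    """Assemble a gb2_ds_get argument list (hypothetical helper)."""
    cmd = ["gb2_ds_get"]
    if output_dir is not None:
        cmd += ["-o", output_dir]   # -o, --output_dir: path to local directory
    if use_new:
        cmd.append("--new")         # experimental download mode
    if force:
        cmd.append("-f")            # -f, --force: skip confirmation
    if no_subdir:
        cmd.append("--noSubDir")    # skip files in subdirectories
    cmd.append(dataset)             # positional argument: dataset name
    return cmd


# "downloads" is an illustrative directory name, not taken from the text above.
cmd = build_ds_get_command("project/sub00/file00*.root",
                           output_dir="downloads", force=True)
# shlex.join renders the list as a shell-safe command line for inspection.
print(shlex.join(cmd))
```

The resulting list could then be passed to subprocess.run(cmd, check=True) in an environment where the gb2 tools are set up. No shell is involved in that case, so the wildcard reaches gb2_ds_get unexpanded, much as the double quotes ensure in the examples above.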
gb2_ds_list
List datasets or files in the specified directory. By default, only files with the status ‘good’ in the metadata catalog are shown. Use the option -s to list files with other statuses, and ‘-s all’ to list all files. Examples:
$ gb2_ds_list -u username
$ gb2_ds_list "/belle/MC/signal/B2DstpDstm/mcprod*/BGx*"
$ gb2_ds_list dataset -l -g
$ gb2_ds_list dataset -s all
$ gb2_ds_list dataset -s good
usage: gb2_ds_list [-h] [-v] [--usage] [-u USER] [-r {MC,data,user}] [-l] [-g] [-s STATUS] [dataset]
Positional Arguments
- dataset
specify dataset name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -u, --user
specify user name
- -r, --subcate
Possible choices: MC, data, user
specify a dataset category
- -l, --long
long listing (-ll: extra long)
- -g, --group_by_se
groups by SEs
- -s, --status
specify status of file
gb2_ds_du
Show disk usage of specific datasets (default: /belle/user/USER/*).
Use ‘-u all’ to select all users.
Examples:
$ gb2_ds_du 7345232
$ gb2_ds_du -u username "proj1*"
usage: gb2_ds_du [-h] [-v] [--usage] [-u USER] [-r {MC,data,user}] [--noBar] [dataset]
Positional Arguments
- dataset
specify dataset name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -u, --user
specify user name
- -r, --subcate
Possible choices: MC, data, user
specify a dataset category
- --noBar
disable status bar
Default:
False
gb2_ds_count_events
Prints the number of events for each file in a dataset or for each dataset in a datablock.
Accepts SQL-like syntax for the metadata query. The query string should be quoted with single or double quotation marks.
Examples:
% gb2_ds_count_events datasets
% gb2_ds_count_events -u username "dataset*"
% gb2_ds_count_events -q "status='good' and runHigh>100" dataset
% gb2_ds_count_events --summary --output_json "output_name.json" dataset
usage: gb2_ds_count_events [-h] [-v] [--usage] [-u USER] [-q QUERY] [--summary] [--output_json OUTPUT_JSON] dataset [dataset ...]
Positional Arguments
- dataset
specify dataset(s) name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -u, --user
specify user name
- -q, --query
query for metadata
- --summary
total number of events and number of files
Default:
False
- --output_json
specify json file name
gb2_ds_query_file
Query file metadata
Accepts SQL-like syntax for the metadata query. The query string should be quoted with single or double quotation marks.
Examples:
$ gb2_ds_query_file dataset
$ gb2_ds_query_file -u username dataset
$ gb2_ds_query_file -m "status:software" dataset
$ gb2_ds_query_file -q "status='good'" dataset
$ gb2_ds_query_file -q "runL>5 and runH<10" dataset
$ gb2_ds_query_file -q "runH<100" "/belle/MC/generic/ccbar/mcprod1405/BGx1/"
usage: gb2_ds_query_file [-h] [-v] [--usage] [-C CONF] [-m META] [-q QUERY] [-u USER] [-r {MC,data,user}] [--output_csv OUTPUT_CSV] [-l] dataset [dataset ...]
Positional Arguments
- dataset
specify dataset(s) name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -C, --conf
specify configuration file
- -m, --meta
specify metadata attribute list
- -q, --query
query for metadata
- -u, --user
specify user name
- -r, --subcate
Possible choices: MC, data, user
specify a dataset category
- --output_csv
specify csv file name
- -l, --long
long listing (-ll: extra long)
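The quoting requirement for -q matters most when the query is generated by a script. The sketch below is a hedged illustration only: build_query is a hypothetical helper, it assumes conditions are combined with "and" as in the examples above, and it prints the command instead of running it. The attribute names runL and runH are taken from those examples.

```python
import shlex


def build_query(*conditions):
    """Join individual conditions with 'and' (hypothetical helper)."""
    return " and ".join(conditions)


query = build_query("runL>5", "runH<10")

# The query is kept as ONE list element, so it reaches the tool as a
# single argument regardless of the spaces and '<' / '>' characters.
cmd = ["gb2_ds_query_file", "-q", query, "dataset"]

# shlex.join adds the quoting a shell would need for the same command.
print(shlex.join(cmd))
```

Keeping the query as a single list element plays the same role as the quotation marks on the command line: without them, a shell would split the string at spaces and treat < and > as redirections.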
gb2_ds_query_dataset
Query dataset metadata
Accepts SQL-like syntax for the metadata query. The query string should be quoted with single or double quotation marks.
Examples:
$ gb2_ds_query_dataset dataset
$ gb2_ds_query_dataset -u username "dataset*"
$ gb2_ds_query_dataset -m "software:desc" dataset
$ gb2_ds_query_dataset -q "software='release-00-04-01'" dataset
usage: gb2_ds_query_dataset [-h] [-v] [--usage] [-C CONF] [-m META] [-q QUERY] [-u USER] [-r {MC,data,user}] [-l] dataset [dataset ...]
Positional Arguments
- dataset
specify dataset(s) name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -C, --conf
specify configuration file
- -m, --meta
specify metadata attribute list
- -q, --query
query for metadata
- -u, --user
specify user name
- -r, --subcate
Possible choices: MC, data, user
specify a dataset category
- -l, --long
long listing (-ll: extra long)
gb2_ds_query_datablock
Query datablock metadata
Accepts SQL-like syntax for the metadata query. The query string should be quoted with single or double quotation marks.
Examples:
$ gb2_ds_query_datablock datablock
$ gb2_ds_query_datablock dataset/sub00
$ gb2_ds_query_datablock -m "size:nFiles" dataset
$ gb2_ds_query_datablock -q "nFiles=1000" dataset
usage: gb2_ds_query_datablock [-h] [-v] [--usage] [-C CONF] [-m META] [-q QUERY] [-u USER] [-r {MC,data,user}] [-l] dataset [dataset ...]
Positional Arguments
- dataset
specify dataset(s) name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -C, --conf
specify configuration file
- -m, --meta
specify metadata attribute list
- -q, --query
query for metadata
- -u, --user
specify user name
- -r, --subcate
Possible choices: MC, data, user
specify a dataset category
- -l, --long
long listing (-ll: extra long)
gb2_ds_rep
Replicate datablocks to another SE. Input datasets are resolved into datablocks.
By default, the replication rule is associated with your account. If -u <username> is provided, the rule is associated with that account instead. Specify --lifetime xh/xd/xw/xm to set a custom lifetime (default: 1m, i.e. one month). The replica is deleted after the lifetime expires.
Examples:
% gb2_ds_rep /belle/user/anil123/myproject/sub00 -d DESY-TMP-SE
% gb2_ds_rep /belle/user/anil123/myotherproject/sub00 -d KIT-TMP-SE -u belle_dcops
% gb2_ds_rep /belle/user/anil123/myotherproject1 -d KIT-TMP-SE --lifetime 2w
usage: gb2_ds_rep [-h] [-v] [--usage] [-s SE] -d SE [-u USER] [-f] [--lifetime LIFETIME] dataset [dataset ...]
Positional Arguments
- dataset
specify dataset(s) name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -s, --src_se
source SE
- -d, --dst_se
destination SE
- -u, --user
specify user name
- -f, --force
skip confirmation
Default:
False
- --lifetime
set lifetime for LPN. xh(our), xd(ay), xw(eek), xm(onth). Default: 1m(onth)
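The --lifetime format (a number followed by h, d, w or m) can be checked in a script before gb2_ds_rep is called. This is a minimal sketch under stated assumptions: parse_lifetime is a hypothetical helper, x is assumed to be a whole number, and the validation gb2_ds_rep itself performs may differ.

```python
import re

# Unit letters as documented for --lifetime.
UNIT_NAMES = {"h": "hour", "d": "day", "w": "week", "m": "month"}


def parse_lifetime(value):
    """Split a lifetime such as '2w' into (count, unit name).

    Assumption: the count is a whole number directly followed by one
    of the unit letters h, d, w, m.
    """
    match = re.fullmatch(r"(\d+)([hdwm])", value)
    if match is None:
        raise ValueError(f"unsupported lifetime: {value!r}")
    count, unit = match.groups()
    return int(count), UNIT_NAMES[unit]


count, unit = parse_lifetime("2w")
print(f"replica lifetime: {count} {unit}(s)")
```

Rejecting malformed values early avoids submitting a replication request whose lifetime would be refused or misread.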
gb2_ds_rep_status
Check the status of replication after the request with gb2_ds_rep.
Examples:
$ gb2_ds_rep_status /belle/user/anil123/myproject/sub00
$ gb2_ds_rep_status /belle/user/anil123/myotherproject
usage: gb2_ds_rep_status [-h] [-v] [--usage] [-l] dataset [dataset ...]
Positional Arguments
- dataset
specify dataset(s) name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -l, --long
long listing (-ll: extra long)
gb2_ds_rm
Asynchronously removes files and metadata associated with the dataset or project name provided. All replicas on the SEs are deleted.
Examples:
$ gb2_ds_rm project_name
$ gb2_ds_rm "/belle/user/hideki/project_*"
$ gb2_ds_rm -u somebody project_name
$ gb2_ds_rm -f project_name
usage: gb2_ds_rm [-h] [-v] [--usage] [-f] [-u USER] [-r {MC,data,user}] [--noBar] dataset [dataset ...]
Positional Arguments
- dataset
specify dataset(s) name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -f, --force
skip confirmation
Default:
False
- -u, --user
specify user name
- -r, --subcate
Possible choices: MC, data, user
specify a dataset category
- --noBar
disable status bar
Default:
False
gb2_ds_siteForecast
List possible job execution sites based on replica location. Note that site availability is not considered.
Examples:
$ gb2_ds_siteForecast /belle/MC/release-02-00-01/DB00000411/MC11/prod00005218/s00/e0000/4S/r00000/ddbar/mdst/sub00
usage: gb2_ds_siteForecast [-h] [-v] [--usage] [-u USER] [-r {MC,data,user}] [dataset]
Positional Arguments
- dataset
specify dataset name
Named Arguments
- -v, --verbose
increase verbosity (up to -vv)
Default:
0
- --usage
show detailed usage
- -u, --user
specify user name
- -r, --subcate
Possible choices: MC, data, user
specify a dataset category