Check in a dataset
Checking in data with PIPES can be done in two ways: either as a standalone call or in conjunction with a task submission.
Method 1: Standalone call
- Request the metadata template from PIPES (see the Tip below on automation if needed):
Where <system-type> indicates the system where the dataset is stored, which tells PIPES what metadata to expect.
Info
For the PIPES MVP, the supported storage system types are ESIFRepoAPI, AmazonS3, HPCStorage, and DataFoundry.
Additional storage options will be added in future releases. If you need a specific storage type, please reach out to the PIPES development team.
See the schema requirements for more information on the metadata fields in the Dataset Config file.
- Fill in the metadata fields in the config file.
- Submit it to PIPES using the following command:
$ pipes dataset checkin -p project_name -r project_run_name -m model_name -x model_run_name -f path/to/toml/dataset.toml
Tip
If you would like to automate Dataset checkins, you do not need to request a template from PIPES. Instead, your application can reference the API directly to produce a config file that meets the schema requirements for Datasets and then submit the config to PIPES.
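As a minimal shell sketch of scripted checkins, assuming your application has already written a Dataset Config file, the helpers below validate a system type against the MVP list and wrap the checkin command shown above. The function names and the PIPES_BIN variable are hypothetical and not part of PIPES; only the pipes dataset checkin flags are taken from this page.

```shell
# Hypothetical helpers for scripted checkins; not part of the PIPES CLI.

# Succeeds only for the storage system types this page lists for the PIPES MVP.
is_supported_system_type() {
  case "$1" in
    ESIFRepoAPI|AmazonS3|HPCStorage|DataFoundry) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: checkin_dataset PROJECT PROJECT_RUN MODEL MODEL_RUN CONFIG_TOML
# Runs the checkin command shown above. PIPES_BIN is an assumed override
# (default: pipes) so the call can be dry-run, e.g. with PIPES_BIN=echo.
checkin_dataset() {
  if [ "$#" -ne 5 ]; then
    echo "usage: checkin_dataset PROJECT PROJECT_RUN MODEL MODEL_RUN CONFIG_TOML" >&2
    return 2
  fi
  # Fail early instead of handing PIPES a path that does not exist.
  if [ ! -f "$5" ]; then
    echo "config file not found: $5" >&2
    return 1
  fi
  "${PIPES_BIN:-pipes}" dataset checkin -p "$1" -r "$2" -m "$3" -x "$4" -f "$5"
}
```

Setting PIPES_BIN=echo before calling checkin_dataset prints the arguments that would be passed to pipes rather than running it, which is useful for checking a script before a real checkin.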
Method 2: In conjunction with transformation task submission
Warning
Please note that this method only applies if your dataset is the result of a transformation task. Otherwise, please use the standalone command in Method 1.
- Get the transformation Task Creation Config template.
- Fill in this config template.
- Get a Dataset Config checkin template by specifying a system type: ESIFRepoAPI, AmazonS3, HPCStorage, or DataFoundry.
- Submit the transformation task together with the transformed dataset generated by this task:
$ pipes task submit -p test1 -r 1 -m dsgrid -x model-run-1 -f transformation-task-submission1.toml -d my-transformed-dataset.toml --task-pass
For additional information, please refer to submitting a transformation task.
- For a fuller description of datasets, please see the Dataset Reference.
- For specifics on the metadata keys and their types in the Dataset template, see the Dataset Config.
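The Method 2 submission can also be scripted. The shell sketch below checks that both config files exist before calling the pipes task submit command shown in Method 2. The function name and the PIPES_BIN override are hypothetical and not part of PIPES; the --task-pass flag is copied verbatim from that command, with no further assumption about its meaning.

```shell
# Hypothetical wrapper; not part of the PIPES CLI.
# Usage: submit_transformation_task PROJECT PROJECT_RUN MODEL MODEL_RUN TASK_TOML DATASET_TOML
# PIPES_BIN is an assumed override (default: pipes) that allows a dry run.
submit_transformation_task() {
  if [ "$#" -ne 6 ]; then
    echo "usage: submit_transformation_task PROJECT PROJECT_RUN MODEL MODEL_RUN TASK_TOML DATASET_TOML" >&2
    return 2
  fi
  # Fail early if either the task config or the dataset config is missing.
  for f in "$5" "$6"; do
    if [ ! -f "$f" ]; then
      echo "config file not found: $f" >&2
      return 1
    fi
  done
  "${PIPES_BIN:-pipes}" task submit -p "$1" -r "$2" -m "$3" -x "$4" -f "$5" -d "$6" --task-pass
}
```

With PIPES_BIN=echo set, the call prints the arguments it would pass to pipes, so the task and dataset paths can be checked before a real submission.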