Validation
PIPES performs several steps of validation to help users catch errors and maintain consistency throughout a project.
Requirements
PIPES uses requirements to validate the metadata of datasets. The results of this validation also inform the progress status of the project.
Examples
A requirement on model years, for example, can indicate that modeling work, and therefore dataset outputs, should be relevant to the years 2025 and 2030.
Requirements can be single key/value pairs, as in that example, or any valid nested TOML key/value pairs.
The Pydantic schemas for the various PIPES objects have several pre-defined requirement keys (see Configs for more info). If a requirement key is not pre-defined by PIPES, the key/value pair will be stored in a field called other in the Pydantic schema.
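As a rough illustration of how unrecognized keys end up under other, the sketch below splits a requirements mapping into pre-defined keys and everything else. It uses plain dictionaries rather than the actual PIPES Pydantic schemas, and the key names (model_years, geographic_extent, weather_year, resolution) and the helper split_requirements are assumptions made for this example, not the keys PIPES defines.

```python
from typing import Any

# Hypothetical pre-defined keys; the real list is documented in Configs.
PREDEFINED_KEYS = {"model_years", "geographic_extent", "weather_year"}


def split_requirements(raw: dict[str, Any]) -> dict[str, Any]:
    """Keep pre-defined keys at the top level and collect the rest under 'other'."""
    known = {key: value for key, value in raw.items() if key in PREDEFINED_KEYS}
    other = {key: value for key, value in raw.items() if key not in PREDEFINED_KEYS}
    return {**known, "other": other}


# A single key/value requirement.
flat = split_requirements({"model_years": [2025, 2030]})

# A nested requirement whose top-level key is not pre-defined (hypothetical key).
nested = split_requirements(
    {"model_years": [2025, 2030], "resolution": {"temporal": "hourly"}}
)
```

Here nested["other"] holds the whole resolution table, while model_years stays at the top level.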
How requirements validation works
Requirements are specified at three different node types in PIPES: project, project run, and model. Requirements at higher-level nodes (e.g., project) are superseded by requirements at lower-level nodes (e.g., models).
For example, consider the following PIPES graph:
Let’s assume the nodes in this graph have the following requirements:
- Project requirements:
- Project run requirement:
- Model A requirements:
When Dataset 1 is checked in, PIPES will validate its metadata against the following combined requirements:
In other words, it uses all requirements defined at the model level first. Any requirements listed at the project run that are not included in the model requirements are included, such as the geographic extent. Finally, any project requirements not overwritten by the project run or model are used, such as the weather year. This allows projects to scope work and build up results iteratively.
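The precedence described above can be sketched as a layered dictionary merge, applied from the highest-level node down to the lowest so that lower-level values win. This is an illustration only, not the PIPES implementation: the function resolve_requirements, the key names, and all values are hypothetical, and the merge is shallow (whether PIPES merges nested tables key by key is not stated here).

```python
from typing import Any


def resolve_requirements(
    project: dict[str, Any],
    project_run: dict[str, Any],
    model: dict[str, Any],
) -> dict[str, Any]:
    """Combine requirements so that lower-level nodes override higher-level ones."""
    resolved: dict[str, Any] = {}
    # Apply project first and model last; each update overwrites earlier values.
    for level in (project, project_run, model):
        resolved.update(level)
    return resolved


# Hypothetical requirements for the three node types.
project = {"model_years": [2025, 2030, 2035], "weather_year": 2012}
project_run = {"model_years": [2025, 2030], "geographic_extent": "conus"}
model_a = {"model_years": [2030]}

resolved = resolve_requirements(project, project_run, model_a)
```

In this sketch the model years come from Model A, the geographic extent from the project run, and the weather year from the project.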
Now let’s assume that Model B has the following requirements:
For Model A to hand off data to Model B, a dataset needs to meet the following requirements:
In other words, Dataset 1 needs to be transformed such that the model years and geographic extent match what Model B expects to ingest. When Dataset 1 Transformed is checked in, PIPES will verify that its metadata meets the requirements expected by Model B. If these requirements are not met, the data is not ready to be handed off and more work needs to be done. If the dataset does meet these requirements, the Model B team will be notified that they have data ready from Model A.
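A handoff readiness check of this kind could look roughly like the sketch below, which lists the requirement keys a dataset's metadata fails to satisfy. It assumes that "meeting" a requirement means exact equality of values, which may be stricter or looser than what PIPES actually does. The function unmet_requirements and all keys and values are hypothetical.

```python
from typing import Any


def unmet_requirements(
    metadata: dict[str, Any], requirements: dict[str, Any]
) -> list[str]:
    """Return the sorted requirement keys that the metadata is missing or mismatches."""
    return sorted(
        key
        for key, expected in requirements.items()
        if key not in metadata or metadata[key] != expected
    )


# Hypothetical requirements expected by Model B.
model_b = {"model_years": [2030, 2040], "geographic_extent": "county"}

# Hypothetical metadata before and after the transformation.
dataset_1 = {"model_years": [2030], "geographic_extent": "conus", "weather_year": 2012}
dataset_1_transformed = {
    "model_years": [2030, 2040],
    "geographic_extent": "county",
    "weather_year": 2012,
}

not_ready = unmet_requirements(dataset_1, model_b)
ready = unmet_requirements(dataset_1_transformed, model_b)
```

An empty list means the dataset is ready to hand off. A non-empty list names the metadata fields that still need work.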
Schedule validation
PIPES ensures that all dates are within valid ranges.
- For project, project run, and model, the scheduled start is before the scheduled end.
- Project run schedule is within project schedule.
- Model schedule is within project run schedule.
- Handoff schedule is within the project schedule.
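The schedule rules listed above can be sketched with plain dates. The Schedule class and schedule_errors function are hypothetical names for this example, and treating the outer bounds as inclusive is an assumption. The "start before end" rule is applied only to project, project run, and model, as in the list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Schedule:
    start: date
    end: date


def is_within(inner: Schedule, outer: Schedule) -> bool:
    """True if inner lies inside outer (bounds assumed inclusive)."""
    return outer.start <= inner.start and inner.end <= outer.end


def schedule_errors(
    project: Schedule, project_run: Schedule, model: Schedule, handoff: Schedule
) -> list[str]:
    """Collect a message for every schedule rule that is violated."""
    errors = []
    for name, sched in (("project", project), ("project run", project_run), ("model", model)):
        if not sched.start < sched.end:
            errors.append(f"{name}: scheduled start is not before scheduled end")
    if not is_within(project_run, project):
        errors.append("project run schedule is outside project schedule")
    if not is_within(model, project_run):
        errors.append("model schedule is outside project run schedule")
    if not is_within(handoff, project):
        errors.append("handoff schedule is outside project schedule")
    return errors


project = Schedule(date(2024, 1, 1), date(2024, 12, 31))
project_run = Schedule(date(2024, 2, 1), date(2024, 6, 30))
model = Schedule(date(2024, 3, 1), date(2024, 5, 31))
handoff = Schedule(date(2024, 5, 1), date(2024, 5, 15))
late_model = Schedule(date(2024, 3, 1), date(2024, 7, 31))

ok = schedule_errors(project, project_run, model, handoff)
bad = schedule_errors(project, project_run, late_model, handoff)
```

Returning a list of messages rather than raising on the first failure lets a user see every schedule problem in one pass.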
ID name uniqueness
At the project scope level:
- Model names are unique.
- Project scenario names are unique.
- Handoff IDs are unique.
Note
Model names have IDs and, optionally, “types”. This allows users to define two models of the same type but with different purposes, e.g., to perform circular work.
At the model scope level:
- Model scenario names are unique.
At the model run scope level:
- Task IDs are unique.
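Scoped uniqueness checks like these can be sketched by counting names within each scope separately, so the same name may appear in two different scopes without conflict. The function names and the sample names below are hypothetical.

```python
from collections import Counter


def duplicates(names: list[str]) -> list[str]:
    """Return the sorted names that occur more than once."""
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)


def duplicates_by_scope(names_by_scope: dict[str, list[str]]) -> dict[str, list[str]]:
    """Check each scope independently and report only scopes with duplicates."""
    report = {}
    for scope, names in names_by_scope.items():
        repeated = duplicates(names)
        if repeated:
            report[scope] = repeated
    return report


# Project scope: model names must be unique across the project.
model_name_clashes = duplicates(["model_a", "model_b", "model_a"])

# Model run scope: task IDs only need to be unique within each model run.
task_id_clashes = duplicates_by_scope(
    {"run_1": ["transform", "upload"], "run_2": ["transform", "transform"]}
)
```

In this example "transform" appearing in both run_1 and run_2 is fine. Only its repetition inside run_2 is reported.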
Resource existence
PIPES checks for existence of resources specified in the input.
- User-specified tasks exist.
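An existence check of this kind amounts to comparing the referenced names against the defined ones. A minimal sketch, with a hypothetical function name and hypothetical task names:

```python
def missing_tasks(requested: list[str], defined: list[str]) -> list[str]:
    """Return requested task names that are not defined, in the order requested."""
    defined_set = set(defined)
    return [task for task in requested if task not in defined_set]


missing = missing_tasks(["transform", "upload"], ["transform", "qaqc"])
```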
Validation in PIPES is a critical step for ensuring the integrity and consistency of the project configuration.