Initialize a project
The easiest way to start a project is to request a Project Config template from PIPES.
This outputs a TOML file called my-project.toml, which you can edit to include information about the project.
Although one person (the PI or Data Manager) is responsible for submitting the Project Config, creating it is an iterative team process. It requires feedback from, and collaboration with, the entire modeling leadership team to best capture the topology and the requirements, scenarios, and assumptions of each Model.
Below we discuss the main pieces of the Project Config file and their purposes. For detailed information on the schema, including type requirements and an example TOML, see the schema documentation.
Project
Project information encapsulates the basic pieces of integrated modeling projects.
Basic Info
The first block of information to provide when starting a PIPES project contains the Project name, description, assumptions, schedule, milestones, requirements, and owner.
There are two name properties of a Project: name is used for reference in the PIPES database (you can think of this like an ID or shorthand abbreviation), while the full_name property is the proper name that will be used in the UI and other visualizations.
Assumptions are high-level descriptions of project modeling behavior. They can be given as a list of strings (like a bulleted list) or as a single string. This is a good place to put links related to project assumptions, which are best included using a list of strings.
The Project schedule should delineate the start and end dates of your project. These dates will constrain dates in other parts of your Project (see Validation for more details). Related to the project schedule are milestones: discrete dates with deliverables that you would like your team to know and be notified about. Milestone dates are also used in the Schedule Tab of the PIPES UI and will likely be referenced regularly by the project team.
Project requirements define metadata requirements on project datasets, which can be refined and altered at other levels in PIPES for specific use cases (see Validation for more detailed information on requirements). Requirements can be thought of as the pieces of metadata that datasets must have to be acceptable for use throughout the project. More requirements mean more conditions for dataset acceptance. PIPES uses these requirements to validate datasets.
Finally, the Project owner is the general point of contact for the project, typically the project PI.
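As a rough illustration of the fields described above, a Project block might look like the sketch below. Only the property names name, full_name, description, assumptions, schedule, milestones, requirements, and owner come from the text. The [project] and [[project.milestones]] table names, the scheduled_start, scheduled_end, and milestone_date keys, the shape of owner, and all values are assumptions made for illustration, so check them against the schema before use. Requirements are left out because their structure is not described on this page.

```toml
# Hypothetical layout: table names and nested key names are assumptions.
[project]
name = "my-project"                           # short reference name used in the PIPES database
full_name = "My Integrated Modeling Project"  # proper name shown in the UI
description = "Placeholder description of the project."
# A list of strings works well when assumptions include links.
assumptions = [
    "Placeholder assumption about project modeling behavior.",
    "Supporting material: https://example.com/assumptions",
]
scheduled_start = "2023-01-01"                # assumed key, mirroring the handoff keys
scheduled_end = "2023-12-31"                  # assumed key, mirroring the handoff keys
owner = "pi@example.com"                      # assumed shape; typically the project PI

[[project.milestones]]
name = "Placeholder milestone"
description = "Placeholder deliverable for the team."
milestone_date = "2023-06-30"                 # assumed key name
```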
Scenarios
The Project scenarios block defines the name and description of every scenario that will be studied in the project.
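A minimal sketch of such a block is shown below. The name and description properties come from the text, and the scenario names are the ones used as examples later on this page. The [[project.scenarios]] table name is an assumption.

```toml
# Hypothetical table name; each entry defines one Project scenario.
[[project.scenarios]]
name = "earlynobio_moderate"
description = "Placeholder description of this scenario."

[[project.scenarios]]
name = "notrans_moderate"
description = "Placeholder description of this scenario."
```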
Project Runs
Project Runs can be thought of as iterations through a modeling pipeline with particular goals and outputs. Defining a Project Run means providing basic info, pipeline models, and the pipeline topology.
Basic Info
Similar to the basic info of a Project, Project Runs need a name, description, assumptions, scenarios, and a schedule.
The name, description, assumptions, schedule, and requirements of a Project Run have the same functions as those at the project level. To learn more about the relationship between Project requirements and Project Run requirements, see Validation.
Scenarios of a Project Run indicate which Project scenarios are being worked on in the Project Run. New scenarios cannot be defined at the Project Run level. At least one Project scenario needs to be included in the Project Run, but multiple Project scenarios are allowed.
Notice
The name of each Project Run must be unique for a given project.
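The basic info of a Project Run could be sketched as follows. The name, description, assumptions, and scenarios properties come from the text. The [[project_runs]] table form is inferred from the project_runs.topology block shown later, and the scheduled_start and scheduled_end keys and all values are assumptions.

```toml
# Hypothetical sketch of one Project Run; values are placeholders.
[[project_runs]]
name = "run-1"                        # must be unique within the project
description = "Placeholder description of this Project Run."
assumptions = ["Placeholder assumption specific to this run."]
# Must reference scenarios already defined at the Project level (at least one).
scenarios = ["earlynobio_moderate", "notrans_moderate"]
scheduled_start = "2023-02-01"        # assumed key name
scheduled_end = "2023-08-31"          # assumed key name
```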
Models
The models of a Project Run indicate which models PIPES should expect work from, and they form the nodes of the data pipeline. Model info includes model, type, description, assumptions, schedule, requirements, scenario mappings, and expected scenarios.
Model is the name of the model and type is the kind of model (e.g., model="rpm" and type="production cost").
The description, assumptions, schedule, and requirements of a model have the same functions as those at the project and Project Run levels. To learn more about the relationship between Project, Project Run, and Model requirements, see Validation.
Scenario mappings allow users to define Model-specific scenarios that map to Project scenarios. This is useful when Model scenarios are more general or more specific than the Project scenarios. For example, consider an energy demand model that determines the load data of a project. It might have a Model scenario “demand_moderate” that corresponds to several Project scenarios, such as “earlynobio_moderate” and “notrans_moderate”. Scenario mappings let users define these correspondences, which PIPES uses to determine the dataset transformations required throughout the Project Run pipeline. Each scenario mapping must include the Model scenario name, the corresponding Project scenarios, and a description of the Model scenario.
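The demand example above might be written roughly as follows. The model and type properties and the scenario names come from the text. The scenario_mappings table name, the model_scenario and project_scenarios keys, and the model name and type value are hypothetical, chosen only to show the three required pieces of a mapping.

```toml
# Hypothetical sketch: key names inside the mapping are assumptions.
[[project_runs.models]]
model = "demand_model"                # placeholder model name
type = "energy demand"                # placeholder type value

[[project_runs.models.scenario_mappings]]
model_scenario = "demand_moderate"    # the Model scenario name
project_scenarios = [                 # the Project scenarios it corresponds to
    "earlynobio_moderate",
    "notrans_moderate",
]
description = "Placeholder description of the Model scenario."
```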
Expected scenarios are used to determine progress for a Model. They can be either Project scenarios or Model scenarios. If a model submits datasets to PIPES that are not associated with an expected scenario, those datasets do not count towards the progress calculation of that Model (see Progress Tracking to learn more). If two Models in a Project Run exchange datasets but have different expected scenarios, PIPES will automatically determine that a transformation must occur on the handoff datasets between them. For example, if model A has expected_scenarios=["demand_moderate"] and hands off data to model B with expected_scenarios=["earlynobio_moderate"], PIPES will determine that datasets passing from model A to model B must be transformed to account for the differences between the scenarios (see Transformations to learn more).
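The model A and model B case can be sketched with the properties named on this page (model, type, expected_scenarios, from_model, to_model). The model names and type values are placeholders, and the scenario mapping that would define "demand_moderate" for model A is omitted for brevity.

```toml
# Two Models with different expected scenarios that exchange data.
[[project_runs.models]]
model = "model_a"                               # placeholder name
type = "energy demand"                          # placeholder type value
expected_scenarios = ["demand_moderate"]        # a Model scenario

[[project_runs.models]]
model = "model_b"                               # placeholder name
type = "production cost"
expected_scenarios = ["earlynobio_moderate"]    # a Project scenario

# Because the expected scenarios differ, a handoff along this link
# would involve a transformation on the exchanged datasets.
[[project_runs.topology]]
from_model = "model_a"
to_model = "model_b"
```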
Notice
The model property of each Model node must be unique for a given Project Run. If the same modeling team will perform modeling tasks for two modeling nodes, the optional model_team property can be set on each Model block to link that team to the node.
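For instance, one team covering two nodes might be expressed along these lines. The model, type, and model_team properties come from the text. The model names, the team identifier, and the assumption that model_team takes a plain string are illustrative only.

```toml
# Two distinct Model nodes handled by the same (hypothetical) team.
[[project_runs.models]]
model = "rpm_base"              # model values must be unique within the Project Run
type = "production cost"
model_team = "rpm-team"         # assumed to be a team name string

[[project_runs.models]]
model = "rpm_sensitivity"
type = "production cost"
model_team = "rpm-team"         # same team linked to a second node
```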
Topology
The topology of a Project Run represents how the models are linked together and which models will be exchanging data. This is defined within the project_runs.topology block of the Project Config like this:
Topology Example
[[project_runs.topology]]
from_model = "dsgrid"
to_model = "rpm"
[[project_runs.topology.handoffs]]
id = "handoff_id3"
description = "8760 system-level load profiles, including T&D losses, before distributed generation..."
scheduled_start = ""
scheduled_end = "2023-05-20"
notes = ""
The from_model and to_model properties refer to Models defined above in the project_runs.models section of the config. Handoffs consist of datasets, and the tasks performed on those datasets, that pass data from from_model to to_model during the Project Run. The handoff id is required and must be unique. A description, scheduled start and end dates, and notes can also be added.
Modeling Teams
Modeling teams allow specific users to be linked to Projects and Project Runs. An example of this block can be seen at the bottom of this Project Config. If a node representing a user already exists, the Project and Project Run will be linked to that existing node. If the user node does not yet exist, it will be created during the Project creation process. A modeling team must be supplied for each Model node included in the Project Runs section.