  1. Once you’ve logged into the head node of the Cloud-Sandbox, go into the /save/ec2-user directory and create a new directory named after your affiliated organization (Ex. mkdir OWP). Go into the newly made directory for the next step.

  2. Clone the latest Cloud-Sandbox code base from the GitHub repository: git clone https://github.com/ioos/Cloud-Sandbox.git

  3. Change directory into Cloud-Sandbox/cloudflow for the steps that follow; Steps 1–3 are collected as shell commands below.
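
     Steps 1–3 as shell commands (OWP stands in for your organization’s name, per the example in Step #1):

     ```bash
     cd /save/ec2-user
     mkdir OWP                       # named after your affiliated organization
     cd OWP
     git clone https://github.com/ioos/Cloud-Sandbox.git
     cd Cloud-Sandbox/cloudflow
     ```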

  4. Change directory into job/jobs. Copy the cloud_sandbox.template file into a new file called “your_model_name.template”. Inside that new file, you must at minimum specify the model name (OFS), the model job type (JOBTYPE), the executable (EXEC), any executable dependencies, and where your model setup is located on the Cloud-Sandbox (MODEL_DIR); see the sketch below. Once completed, go back to the cloudflow directory.
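
     A minimal sketch of the new template file, assuming it keeps the JSON key/value layout of cloud_sandbox.template (all paths and values shown are placeholders):

     ```json
     {
         "JOBTYPE"   : "your_model_name",
         "OFS"       : "your_model_name",
         "EXEC"      : "/save/ec2-user/OWP/your_model/your_model_exe",
         "MODEL_DIR" : "/save/ec2-user/OWP/your_model/run"
     }
     ```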

  5. Change directory into cluster/config. Copy the template.ioos file into a new file called “your_model_name.ioos” to specify the AWS cloud resource configuration you would like your model to run on; see the sketch below. In that file you can edit the following variables:

     • “nodeType”: the AWS node instance type (eligible instances are listed in the cloudflow/cluster/nodeInfo.py Python script).
     • “nodeCount”: the number of instances of that node type to allocate. A word of caution: the Cloud-Sandbox AWS account caps the number of nodes you can allocate for a given instance type.
     • “tags”: the values for “Name” and “Project” should reflect your model name and affiliation.

     Once completed, go back to the cloudflow directory.
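
     A sketch of the new cluster config, assuming the .ioos files keep the JSON layout of template.ioos (keys other than “nodeType”, “nodeCount”, and “tags” are assumptions, and all values are placeholders):

     ```json
     {
         "platform"  : "AWS",
         "nodeType"  : "hpc6a.48xlarge",
         "nodeCount" : 1,
         "tags"      : [
             { "Key": "Name",    "Value": "your_model_name" },
             { "Key": "Project", "Value": "OWP" }
         ]
     }
     ```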

  6. Change directory into workflows. Within that directory, modify the “template_config” variable in the workflow_template.py file so it points to the “your_model_name.ioos” file you constructed in Step #5; see the sketch below. Once completed, go back to the cloudflow directory.
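
     A sketch of the one-line change, assuming “template_config” holds a relative path to the cluster config (the exact path prefix used in workflow_template.py may differ):

     ```python
     # cloudflow/workflows/workflow_template.py
     template_config = 'cluster/config/your_model_name.ioos'
     ```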

  7. Change directory into job. Within that directory, copy the file called “Template.py” to “your_model_name.py”. The original configuration of this file strictly reflects the cloud_sandbox.template file you copied and modified in Step #4. Rename the “Template” Python class to “your_model_name” so it becomes a unique Python class for your specific model; see the sketch below. If your model execution only needs the location of the model run directory and the executable itself, you don’t need to modify anything else in this file. IF your model executable needs more information (Ex: model arguments, specific model libraries to be linked) that you included in the “your_model_name.template” file in Step #4, then you will need to parse that information within the “parseConfig” function inside your new Python class.
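
     A sketch of the renamed class, assuming Template.py follows the pattern of the other Job classes under cloudflow/job (the constructor signature and the MODEL_ARGS key are assumptions):

     ```python
     # cloudflow/job/your_model_name.py
     import json

     from cloudflow.job.Job import Job

     class your_model_name(Job):

         def __init__(self, configfile, NPROCS):
             self.jobtype = 'your_model_name'
             self.configfile = configfile
             self.NPROCS = NPROCS
             with open(configfile, 'r') as f:
                 cfDict = json.load(f)
             self.parseConfig(cfDict)

         def parseConfig(self, cfDict):
             self.OFS = cfDict['OFS']
             self.EXEC = cfDict['EXEC']
             self.MODEL_DIR = cfDict['MODEL_DIR']
             # Parse any extra keys your executable needs (Step #4), e.g.:
             # self.MODEL_ARGS = cfDict['MODEL_ARGS']   # hypothetical key
     ```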

  8. Now, go into the JobFactory.py file. At the very top of the script, insert a new import statement for the new “your_model_name” Python class you constructed in the “your_model_name.py” file in Step #7 (Ex: from cloudflow.job.your_model_name import your_model_name). Inside the “JobFactory” Python class, you will see a “job” function. Inside that function, insert an “elif” statement that matches the “jobtype” variable defined in your “your_model_name.template” file from Step #4 and calls the new Python class you created in Step #7; see the sketch below. Once completed, go back to the cloudflow directory.
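
     A sketch of the two additions, modeled on the existing branches (the surrounding names, such as newjob, configfile, and NPROCS, are assumptions based on that pattern):

     ```python
     # cloudflow/job/JobFactory.py -- at the very top:
     from cloudflow.job.your_model_name import your_model_name

     # ... inside the job() function, alongside the existing elif branches:
         elif jobtype == 'your_model_name':
             newjob = your_model_name(configfile, NPROCS)
     ```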

  9. IF your model executable does not need more information than simply the path to the model run directory and the executable itself, as specified in the “your_model_name.template” file in Step #4, then skip this step and move to the next one. Otherwise, change directory into workflows and go into the tasks.py file. Inside tasks.py, find the “template_run” function. Inside that function, include an “elif” statement that passes the extra arguments along to the launcher script you will modify in the following step; see the sketch below. Follow the code logic highlighted in the “schism” and “dflowfm” code blocks. Once completed, go back into the cloudflow directory.
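
     A sketch of the branch, patterned after the schism and dflowfm blocks this step references (job.OFS, extra_args, and MODEL_ARGS are assumptions; match the names actually used in template_run):

     ```python
     # cloudflow/workflows/tasks.py -- inside template_run():
         elif job.OFS == 'your_model_name':
             # forward the extra keys from Step #4 to the launcher script
             extra_args = job.MODEL_ARGS   # hypothetical key parsed in Step #7
     ```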

  10. Go into the workflows directory. Now modify the “template_launcher.sh” file, which controls the launching of model runs inside the Cloud-Sandbox. IF you had to complete Step #9 because your model needs extra information to kick off the executable, then you will also need to add an “if” shell block to ingest the special input argument; simply follow the code blocks for “schism” and “dflowfm” with the export statements near the top of the shell script. Next, go toward the bottom of the script, where you will see a set of if/elif code blocks for specific model suites. In this section, construct a code block for “your_model” that points to a shell launcher script; see the sketch below. That launcher script for “your_model” will be constructed in the next step, and it takes the specific arguments required to run your model. By default, each launcher script needs the model run directory and the path to the model executable; if your model requires more to launch, include those arguments as well, following the code logic in the other if/elif blocks.
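
      A sketch of the two additions, following the schism/dflowfm pattern (the $model variable and the positional argument number are assumptions; mirror whatever template_launcher.sh actually uses):

      ```bash
      # template_launcher.sh -- near the top, only if Step #9 applied:
      if [ "$model" == "your_model" ]; then
          export MODEL_ARGS=$7           # hypothetical extra positional argument
      fi

      # ... in the existing if/elif chain near the bottom, dispatch to your run script:
      elif [ "$model" == "your_model" ]; then
          ./your_model_run.sh "$MODEL_DIR" "$EXEC"
      ```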

  11. Copy the “model_template_run.sh” file to a new file called “your_model_run.sh”. Inside that file, you will see the general workflow template, which you will need to modify to include: (1) loading the required compilers and libraries used to compile your model suite, (2) exporting any environment variables the AWS cloud environment needs to run your model executable, and (3) the method for calling and running your overall model with the given executable; see the skeleton below.
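
      A skeleton of the three pieces, assuming the run directory and executable arrive as positional arguments per Step #10 (every module name, variable, and path here is a placeholder):

      ```bash
      #!/usr/bin/env bash
      # your_model_run.sh
      MODEL_DIR=$1
      EXEC=$2

      # (1) load the compilers and libraries the model was built with
      module load gcc openmpi                     # placeholder modules

      # (2) export anything the AWS environment needs to run the executable
      export LD_LIBRARY_PATH=/save/ec2-user/OWP/libs:$LD_LIBRARY_PATH   # placeholder

      # (3) call and run the model with the given executable
      cd "$MODEL_DIR"
      mpirun "$EXEC"                              # placeholder launch command
      ```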

  12. Now, go back into the “template_launcher.sh” shell script and ensure that the if/elif code block you constructed in Step #10 calls the “your_model_run.sh” script from Step #11 with matching syntax, and that it passes the script arguments required to properly run the given model suite. Once the modification and code check are completed, go back to the cloudflow directory.

  13. Finally, we can now attempt to run the model! Follow the steps below to submit the Cloud-Sandbox job to the background of the head node and monitor your job’s progress accordingly:

  • Submit the job in the background: nohup python3 ./workflows/workflow_template.py job/jobs/your_model_name.template > your_model_test.out &
  • Run tail -f your_model_test.out to see the progress of the overall Cloud-Sandbox execution of the new model you’ve just finished integrating into the model suite.