Authors: Rebecca Vandewalle rcv3@illinois.edu, Zimo Xiao zimox2@illinois.edu, Furqan Baig fbaig@illinois.edu, and Anand Padmanabhan apadmana@illinois.edu
Last Updated: 10-10-21
CyberGIS-Compute enables students and researchers from diverse backgrounds to take advantage of High Performance Computing (HPC) resources without having to delve into the details of system setup, maintenance and management.
CyberGIS-Compute is designed to run on the CyberGISX platform, which uses Virtual ROGER (Resourcing Open Geospatial Education and Research), a geospatial supercomputer with access to a number of readily available popular geospatial libraries. A major goal of CyberGISX is to give its users straightforward access to software tools, libraries and computational resources for reproducible geospatial research and education, so they can focus on research and application development instead of software installation and system engineering. CyberGIS-Compute bridges the regular capabilities of CyberGISX and powerful HPC resources, so users can apply these computational resources to geospatial problem solving with minimal effort and a low learning curve.
In example 1: the hello world example, we will learn
Hello World example on HPC using CyberGIS-Compute
In example 2: the custom code example, we will learn
In example 3: the user interface example, we will learn
In example 4: the evacuation example, we will learn
To best understand this notebook, you should ideally have:
Run the following code cell to load CyberGIS-Compute, the backend software development kit (SDK). This lets us access and work with High Performance Computing (HPC) resources within CyberGISX.
Note: This cell will generate considerable text output.
# Load the CyberGIS-Compute client
import sys
!{sys.executable} -m pip install --ignore-installed git+https://github.com/cybergis/job-supervisor-python-sdk.git@v2
After installing the client using the previous cell, you must restart the kernel! Then run each code cell individually! The notebook will not work if you select Restart & Run All.
Before we get to the example, it is helpful to introduce some key terms.
A typical High Performance Computing (HPC) job in CyberGISX using CyberGIS-Compute consists of the following major components:
To gain familiarity with the CyberGIS-Compute environment, we first present a simple Hello World example. We will learn how to write code to initialize and work with some of the CyberGIS-Compute components mentioned above.
As mentioned earlier, every notebook using CyberGIS-Compute has to create a CyberGISCompute object to interact with the broader system. The following cell imports the required library from the Python Software Development Kit (SDK) that was downloaded and installed in the Setup section.
# Load CyberGIS-Compute client
from cybergis_compute_client import CyberGISCompute
After importing CyberGIS-Compute, we first need to initialize a CyberGISCompute object. The suffix v2 specifies that we want to use the second version.
# Create CyberGIS-Compute object
cybergis = CyberGISCompute(suffix='v2')
To work with CyberGIS-Compute we need to log in. This helps track use of computing resources.
# Login
cybergis.login()
In this example we are running a prebuilt job. You can see what jobs are available in the current environment using the list_git() function of the CyberGISCompute object we created above. Prebuilt jobs are stored in GitHub repositories. GitHub is a common online storage place for code.
# Show available jobs
cybergis.list_git()
In the next line, you will create a simple job object using the create_job() function. The variable that this function's result is assigned to will be used for further interactions with the job.
# Create a job
demo_job = cybergis.create_job()
In the next line, you will create a simple Hello World job by setting the executableFolder value to the GitHub folder short link for the job. This job also expects a variable named a to be set. We will discuss the process of setting additional options in more detail later.
# Set job options
demo_job.set(executableFolder="git://hello_world", param={"a": 32})
The next step is to submit the job created by the maintainer using the submit() function. Once a job is submitted, the code will be sent to and run on the selected High Performance Computing backend resources.
# Submit the job
demo_job.submit()
Once you have submitted a job, there is a lot going on in the backend. Since CyberGIS-Compute is implemented as a public shared service, a submitted job may not necessarily be executed right away. Instead, all jobs are added to a queue and then scheduled one by one based on available resources. Once a job starts executing, it goes through several steps, such as:
All of these job stages can be observed in real time once a job is submitted by calling the events() function with the liveOutput parameter set to True, as in the following cell.
# View job events
demo_job.events(liveOutput=True)
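The queueing behavior described above can be pictured with a small toy model. It is a sketch only: the function run_fifo and its slots parameter are invented for this illustration and say nothing about how the real CyberGIS-Compute scheduler is implemented.

```python
from collections import deque


def run_fifo(job_names, slots=1):
    """Toy first-in, first-out scheduler (illustrative only).

    Jobs wait in a queue and start in submission order, at most `slots` per round.
    Returns a list of (round, job_name) pairs in the order jobs start.
    """
    queue = deque(job_names)  # jobs waiting, oldest first
    started = []
    round_number = 0
    while queue:
        # Start as many queued jobs as there are free slots this round
        for _ in range(min(slots, len(queue))):
            started.append((round_number, queue.popleft()))
        round_number += 1
    return started


schedule = run_fifo(["job_a", "job_b", "job_c"], slots=2)
print(schedule)
```

With two slots, the third job has to wait for the second round, which is why a submitted job may not start right away on a shared service.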
A submitted job usually goes through a regular life cycle, as mentioned in the previous step. The status of a submitted job provides limited information about the stage the job is currently in. Additional useful information can be retrieved through logs. Logs are very important when you are debugging code for jobs that execute on remote resources. You can access logs for a job using the logs() function with the liveOutput parameter set to True, as shown below.
For this Hello World example, when the job has been successfully submitted and executed, the log will display a message about what scripts are running and print the job parameters.
# View job logs
demo_job.logs(liveOutput=True)
This example illustrated the necessary steps to get started with CyberGIS-Compute on the CyberGISX environment. We went through the setup, basic terminology and life cycle of a simple Hello World job.
In our next step, we will dive more into the details of customizing maintainers to be able to create and execute user specified code in CyberGISX.
In the previous example, we ran some code already written by a developer. However, you may want to change how the job runs by setting job parameters and variables.
In this example, we will further discuss how to set parameters and variables to customize how the job uses HPC resources on the CyberGISX environment using CyberGIS-Compute.
# Create a custom job
custom_job = cybergis.create_job()
You can list the HPC resources available in the current environment using the list_hpc() function of the CyberGISCompute object we created above. Right now we will be using the default keeling_community HPC backend, but you can change the hpc variable within the create_job() function in order to run the code on different computing resources.
# List available HPC resources
cybergis.list_hpc()
Again we will need to specify a Git repository containing the code. For security reasons, only approved Git repositories are allowed. The approved and available repositories can be listed using the list_git() function, as shown in the following cell.
# List approved GitHub repositories
cybergis.list_git()
As above, we will use git://hello_world, which points to the https://github.com/cybergis/cybergis-compute-hello-world.git Git repository.
A job that runs using remote HPC resources can be divided into three primary stages: initialize, execute, and finalize. When we create custom code, we can specify what files should be run at each of these stages. The git://hello_world repository contains three Python files, one for each stage.
Initialization Stage: setup.py, which simply prints the message running setup.
Execution Stage: main.py, which simply prints the message running main.
Finalization Stage: cleanup.py, which simply prints the message running cleanup.
Note that the only stage that needs to be set is the Execution Stage.
A code repository intended for use with CyberGIS-Compute needs to have a manifest.json file, which specifies which code to execute in the different job stages.
A typical manifest.json will have the following format:
{
    "name": "hello world",
    "container": "python",
    "execution_stage": "python {{JOB_EXECUTABLE_PATH}}/main.py"
}
Note that {{JOB_EXECUTABLE_PATH}} will be replaced during the remote job submission process by the actual path needed to run the job.
This example is set up to run a Python script; however, other commands can also be used, such as "execution_stage": "bash run_job.sh".
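The placeholder substitution described above can be pictured with a short, self-contained sketch. It is only an illustration of the idea: the function name render_execution_command and the path /job/executable are made up for this example, and the real replacement is done by CyberGIS-Compute during job submission in a way this notebook does not show.

```python
import json

# The manifest shown above, embedded as text so the sketch is self-contained
MANIFEST_TEXT = """
{
    "name": "hello world",
    "container": "python",
    "execution_stage": "python {{JOB_EXECUTABLE_PATH}}/main.py"
}
"""


def render_execution_command(manifest_text, executable_path):
    """Return the execution-stage command with the placeholder filled in.

    Hypothetical helper for illustration; not part of the CyberGIS-Compute SDK.
    """
    manifest = json.loads(manifest_text)
    # The execution stage is the only stage that must be set
    if "execution_stage" not in manifest:
        raise ValueError("manifest.json must define an execution_stage")
    return manifest["execution_stage"].replace("{{JOB_EXECUTABLE_PATH}}", executable_path)


# "/job/executable" is a made-up path used only for illustration
command = render_execution_command(MANIFEST_TEXT, "/job/executable")
print(command)
```

Checking for execution_stage before rendering mirrors the rule that the Execution Stage is the one stage a repository has to define.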
Now we need to specify the Git repository containing the code for the job object. You can do this using the set() function and the executableFolder parameter. This job also needs a custom variable passed to it. You do this by passing a dictionary called param to the set() function. Note that we are passing a variable called a to CyberGISCompute. Within the job, this variable will be accessed using the name param_a.
# Specify GitHub repository
custom_job.set(executableFolder='git://hello_world', param={"a": "param a is set"})
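The renaming from a to param_a can be sketched as a simple prefixing step. This is a minimal illustration under stated assumptions: the helper to_job_variables is hypothetical, and exposing the values as environment variables is an assumption made here for the sake of a runnable example, not something this notebook confirms.

```python
import os


def to_job_variables(param):
    """Map user-supplied parameter names to the prefixed names seen inside the job.

    Hypothetical helper for illustration; not part of the CyberGIS-Compute SDK.
    """
    return {f"param_{name}": str(value) for name, value in param.items()}


job_variables = to_job_variables({"a": "param a is set"})

# Assumption for illustration only: the job reads the values as environment variables
os.environ.update(job_variables)
print(os.environ["param_a"])
```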
Now we have everything set up to submit our new job with the custom code stored in the configured Git repository. The remaining steps of submitting the job and tracking its status work the same as in the first example.
# Submit custom job
custom_job.submit()
You can get information about the job using the status() function, as shown below.
# Check custom job status
custom_job.status()
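If you want to wait for a job from a script rather than watch it interactively, one common pattern is to poll the status at intervals. The sketch below shows the pattern only. FakeJob is a stand-in written for this example, and the state names "queued", "running", "finished" and "failed" are assumptions; the value actually returned by status() in CyberGIS-Compute may look different, so adapt the check to what you see.

```python
import time


class FakeJob:
    """Stand-in for a job object; NOT the CyberGIS-Compute job class."""

    def __init__(self, states):
        self._states = iter(states)
        self._last = None

    def status(self):
        # Return the next scripted state, repeating the last one when exhausted
        self._last = next(self._states, self._last)
        return self._last


def wait_until_done(job, done_states=("finished", "failed"), interval=0.0, max_polls=100):
    """Poll job.status() until it reports a terminal state or max_polls is reached."""
    seen = []
    for _ in range(max_polls):
        state = job.status()
        seen.append(state)
        if state in done_states:
            return seen
        time.sleep(interval)  # pause between polls to avoid hammering the service
    raise TimeoutError(f"job not done after {max_polls} polls")


history = wait_until_done(FakeJob(["queued", "running", "finished"]))
print(history)
```

The max_polls limit keeps the loop from running forever if a job never reaches a terminal state.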
Next you can track the job progress.
# Check custom job events
custom_job.events(liveOutput=True)
As above, you can check the job output using the logs() function.
# Check custom job logs
custom_job.logs(liveOutput=True)
Finally, you can download the output and error messages using the downloadResultFolder() function. The zip folder will be placed in the same location as this notebook.
# create results folder to store downloaded results
import os
if not os.path.isdir("custom_result"):
    os.mkdir("custom_result")
# Download results from custom job
custom_job.downloadResultFolder('./custom_result')
This example illustrated the necessary steps to run custom code on the CyberGISX environment using CyberGIS-Compute. We went quickly through how to set up and specify a git repository that contains code and ran the custom job by specifying where to access the code.
Future examples will demonstrate how to run existing maintainers with custom data and how to run custom code that uses custom data.
We are working on a graphical job submission user interface to simplify the job submission process and help make sure the submitted options make sense for the submitted job. Although this interface is under active development, it still can be used to select and run jobs.
Try setting the Git Repository to git://cybergis-compute-modules-test.
cybergis.create_job_by_UI()
In this final example, we will use CyberGIS-Compute to run a more complex job based on an agent-based model of evacuation.
Using the commands described above, we can review the available resources, maintainers, and custom code repositories.
# Review CyberGIS-Compute resources
cybergis.list_git()
Next, as above, we create a job, set the GitHub repository to the evacuation repo, and submit the job.
# Create job, set GitHub repo, and submit job
demo_job = cybergis.create_job()
demo_job.set(executableFolder="git://fireabm", param={"start_value": 20},
             slurm={"num_of_task": 2, "walltime": "10:00"})
demo_job.submit()
We can now view the job events as it is running.
# View job events
demo_job.events(liveOutput=True)
Once the job has finished, you can view logs. In this case, information about the running software is displayed.
# View job logs
demo_job.logs(liveOutput=True)
Now we can save the completed evacuation computation results to a local folder and extract them from the zip file.
# create results folder to store downloaded results
import os
if not os.path.isdir("fireabm_result/result/"):
    os.makedirs("fireabm_result/result/")
# Save results to zip
result_zip = demo_job.downloadResultFolder('./fireabm_result')
# Extract results
import os, zipfile
extract_results_to = "fireabm_result/result/"
with zipfile.ZipFile(result_zip, 'r') as zip_ref:
    zip_ref.extractall(extract_results_to)
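The create-folder-then-extract steps above can also be wrapped in one small helper. The following is a sketch that runs without a real job: it builds a tiny stand-in archive in a temporary folder, and the helper name extract_zip and the member name job.stdout are made up for the example rather than taken from real CyberGIS-Compute output.

```python
import os
import tempfile
import zipfile


def extract_zip(zip_path, target_dir):
    """Extract zip_path into target_dir and return the sorted member names."""
    os.makedirs(target_dir, exist_ok=True)  # no error if the folder already exists
    with zipfile.ZipFile(zip_path, "r") as zip_ref:
        zip_ref.extractall(target_dir)
        return sorted(zip_ref.namelist())


# Build a tiny stand-in archive so the sketch runs without a real job
with tempfile.TemporaryDirectory() as workdir:
    demo_zip = os.path.join(workdir, "demo_result.zip")
    with zipfile.ZipFile(demo_zip, "w") as zf:
        zf.writestr("job.stdout", "running main\n")  # made-up member name
    members = extract_zip(demo_zip, os.path.join(workdir, "result"))
    extracted_ok = os.path.isfile(os.path.join(workdir, "result", "job.stdout"))

print(members, extracted_ok)
```

Using os.makedirs with exist_ok=True replaces the separate os.path.isdir check, and returning the member names makes it easy to confirm what the archive contained.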
Finally, we can visualize the results of the computation and plot the result.
# Display results from the job
import glob
from IPython.display import Video, HTML
rfile = glob.glob("./fireabm_result/result/demo_quick_start20/1videos/*.mp4")[0]
HTML('<video width="300" controls><source src="%s" type="video/mp4"></video>'%rfile)
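Indexing the glob result with [0] raises an IndexError if the job produced no video, which can be confusing to debug. A minimal sketch of a more explicit lookup follows; the helper name first_match and the file name run.mp4 are hypothetical, and the temporary folder only stands in for a real results folder.

```python
import glob
import os
import tempfile


def first_match(pattern):
    """Return the first path (in sorted order) matching pattern, or raise a clear error."""
    matches = sorted(glob.glob(pattern, recursive=True))
    if not matches:
        raise FileNotFoundError(f"no files match {pattern!r}")
    return matches[0]


# Create a stand-in results folder containing one empty video file
with tempfile.TemporaryDirectory() as workdir:
    video_dir = os.path.join(workdir, "videos")
    os.makedirs(video_dir)
    open(os.path.join(video_dir, "run.mp4"), "w").close()
    found = os.path.basename(first_match(os.path.join(workdir, "**", "*.mp4")))

print(found)
```

Sorting the matches makes the choice of file repeatable when a job writes more than one video.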
This example illustrated how to run a more complex custom program on the CyberGISX environment using CyberGIS-Compute.
Future examples will demonstrate how to run existing maintainers with custom data and how to run custom code that uses custom data.