Introduction to CyberGIS-Compute

Authors: Rebecca Vandewalle rcv3@illinois.edu, Zimo Xiao zimox2@illinois.edu, Furqan Baig fbaig@illinois.edu, and Anand Padmanabhan apadmana@illinois.edu
Last Updated: 10-10-21

CyberGIS-Compute enables students and researchers from diverse backgrounds to take advantage of High Performance Computing (HPC) resources without having to delve into the details of system setup, maintenance and management.

CyberGIS-Compute is designed to run on the CyberGISX platform, which uses Virtual ROGER (Resourcing Open Geospatial Education and Research), a geospatial supercomputer with access to a number of readily available popular geospatial libraries. A major goal of CyberGISX is to provide its users with straightforward access to software tools, libraries and computational resources for reproducible geospatial research and education, so they may focus their efforts on research and application development activities instead of software installation and system engineering. CyberGIS-Compute provides a bridge between the regular capabilities of CyberGISX and powerful HPC resources so its users can leverage these computational resources for geospatial problem solving with minimal effort and a low learning curve.

In example 1: the hello world example, we will learn

  • The basics of the CyberGIS-Compute environment
  • The life cycle of a typical job in CyberGIS-Compute
  • How to run a simple predefined Hello World example on HPC using CyberGIS-Compute

In example 2: the custom code example, we will learn

  • How to connect custom code to CyberGIS-Compute
  • How to run a job with custom (user supplied) code on HPC using CyberGIS-Compute

In example 3: the user interface example, we will learn

  • How to run a job using the user interface

In example 4: the evacuation example, we will learn

  • How to run a more complex job using CyberGIS-Compute

Prerequisites

To best understand this notebook, you should ideally have:

  • Familiarity with the Python programming language
  • Some experience working with Jupyter Notebooks
  • Some familiarity with Git repositories

Setup

Run the following code cell to install the CyberGIS-Compute software development kit (SDK). This allows us to access and work with High Performance Computing (HPC) resources within CyberGISX.

Note: This cell will generate considerable text output.

In [ ]:
# Load the CyberGIS-Compute client

import sys
!{sys.executable} -m pip install --ignore-installed git+https://github.com/cybergis/job-supervisor-python-sdk.git@v2

Important!

After installing the client using the previous cell you must restart the Kernel! Then, run each code cell individually! The notebook will not work if you select Restart & Run All.

CyberGIS-Compute terminology

Before we get to the example, it is helpful to introduce some key terms.

A typical High Performance Computing (HPC) job in CyberGISX using CyberGIS-Compute consists of the following major components:

  • CyberGIS-Compute
    • This is an entry point to the CyberGISX environment from a Python/Jupyter notebook. All interactions with the High Performance Computing (HPC) backend are performed using this object.
  • High Performance Computing (HPC) Resources
    • These are backend resources which typically require considerable effort to set up and maintain
    • The details of working with these resources are abstracted from users
    • These include a number of popular geospatial libraries
  • CyberGIS-Compute SDK (Software Development Kit)
    • The CyberGIS-Compute SDK provides an application programming interface (API) for creating the CyberGIS-Compute objects used to submit computational tasks to HPC resources, monitor those tasks, and download results after they have run on remote HPC resources

Example 1: Hello World - Running Prepackaged Code using CyberGIS-Compute

To gain familiarity with the CyberGIS-Compute environment, we first present a simple Hello World example. We will learn how to write code to initialize and work with some of the CyberGIS-Compute components mentioned above.

Import the CyberGIS-Compute client

As mentioned earlier, every notebook using CyberGIS-Compute has to create a CyberGISCompute object to interact with the broader system. The following cell imports the required library from the Python Software Development Kit (SDK) that was downloaded and installed in the Setup section.

In [1]:
# Load CyberGIS-Compute client
from cybergis_compute_client import CyberGISCompute

Create a CyberGIS-Compute object

After importing CyberGIS-Compute, we first need to initialize a CyberGISCompute object. The suffix='v2' argument specifies that we want to use version 2.

In [2]:
# Create CyberGIS-Compute object
cybergis = CyberGISCompute(suffix='v2')

To work with CyberGIS-Compute we need to log in. This helps track use of computing resources.

In [3]:
# Login
cybergis.login()
💻 Found system token
🎯 Logged in as beckvalle@cybergisx.cigi.illinois.edu

List GitHub repositories

In this example we are running a prebuilt job. You can see what jobs are available in the current environment using the list_git() function for the CyberGISCompute object we created above. Prebuilt jobs are stored in GitHub repositories. GitHub is a common online hosting service for code.

In [4]:
# Show available jobs
cybergis.list_git()
link | name | container | repository | commit
git://uncertainty_in_spatial_accessibility | Uncertainty_in_Spatial_Accessibility | cybergisx-0.4 | https://github.com/JinwooParkGeographer/Uncertainty-in-Spatial-Accessibility.git | NONE
git://spatial_access_covid-19 | COVID-19 spatial accessibility | cybergisx-0.4 | https://github.com/cybergis/cybergis-compute-spatial-access-covid-19.git | NONE
git://mpi_hello_world | MPI Hello World | mpich | https://github.com/cybergis/cybergis-compute-mpi-helloworld.git | NONE
git://hello_world | hello world | python | https://github.com/cybergis/cybergis-compute-hello-world.git | NONE
git://fireabm | hello FireABM | cybergisx-0.4 | https://github.com/cybergis/cybergis-compute-fireabm.git | NONE
git://data_fusion | data fusion | python | https://github.com/CarnivalBug/data_fusion.git | NONE
git://cybergis-compute-modules-test | modules test | cjw-eb | https://github.com/alexandermichels/cybergis-compute-modules-test.git | NONE
git://bridge_hello_world | hello world | python | https://github.com/cybergis/CyberGIS-Compute-Bridges-2.git | NONE

Create an HPC job

In the next line, you will create a simple job object using the create_job() function. The variable to which the function's result is assigned will be used for further interactions with the job.

In [5]:
# Create a job
demo_job = cybergis.create_job()
🎯 Logged in as beckvalle@cybergisx.cigi.illinois.edu
id | hpc | executableFolder | dataFolder | resultFolder | param | slurm | userId | maintainer | createdAt
1635346193d8oum | keeling_community | | | | {} | null | beckvalle@cybergisx.cigi.illinois.edu | community_contribution | 2021-10-27T14:49:52.000Z

In the next line, you will create a simple Hello World job by setting the executableFolder value to the GitHub folder short link for the job. This job also expects a variable named a to be set. We discuss setting additional options in more detail later.

In [6]:
# Set job options
demo_job.set(executableFolder="git://hello_world", param={"a": 32})
id | hpc | executableFolder | dataFolder | resultFolder | param | slurm | userId | maintainer | createdAt
1635346193d8oum | keeling_community | git://hello_world | | | {"a": 32} | null | beckvalle@cybergisx.cigi.illinois.edu | community_contribution | 2021-10-27T14:49:52.000Z

Submit the HPC job

The next step is to submit the job we configured above using the submit() function. Once a job is submitted, the code will be sent to and run on the selected High Performance Computing backend resources.

In [7]:
# Submit the job
demo_job.submit()
✅ job submitted
id | hpc | executableFolder | dataFolder | resultFolder | param | slurm | userId | maintainer | createdAt
1635346193d8oum | keeling_community | git://hello_world | | | {"a": 32} | null | beckvalle@cybergisx.cigi.illinois.edu | community_contribution | 2021-10-27T14:49:52.000Z
Out[7]:
<cybergis_compute_client.Job.Job at 0x7fd7c0abd910>

View job events

Once you have submitted a job, there is a lot going on in the backend. Since CyberGIS-Compute is implemented as a public shared service, a submitted job may not necessarily be executed right away. Instead, all jobs are added to a queue and then scheduled one by one based on available resources. Once a job starts executing, it goes through several steps, such as:

  • Uploading the executable to the HPC backend
  • Creating directories for writing results (None for this example)
  • Running the executable
  • Getting the job results

All of these job stages can be observed in real time once a job is submitted using the events() function, with the liveOutput parameter set to True, which you can see in the following cell.

In [8]:
# View job events
demo_job.events(liveOutput=True)
📋 Job events:
📮 Job ID: 1635346193d8oum
🖥 HPC: keeling_community
types | message | time
JOB_QUEUED | job [1635346193d8oum] is queued, waiting for registration | 2021-10-27T14:49:52.000Z
JOB_REGISTERED | job [1635346193d8oum] is registered with the supervisor, waiting for initialization | 2021-10-27T14:49:54.000Z
JOB_INIT | job [1635346193d8oum] is initialized, waiting for job completion | 2021-10-27T14:49:57.000Z
JOB_ENDED | job [1635346193d8oum] finished | 2021-10-27T14:50:00.000Z
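
The timestamps in the events output show how long the job spent between stages. The cell below is a minimal sketch and not part of the CyberGIS-Compute SDK: the EVENTS list is copied by hand from the output above, and parse_time and stage_durations are hypothetical helper names introduced only for illustration.

```python
from datetime import datetime

# Event types and timestamps copied by hand from the events output above.
EVENTS = [
    ("JOB_QUEUED", "2021-10-27T14:49:52.000Z"),
    ("JOB_REGISTERED", "2021-10-27T14:49:54.000Z"),
    ("JOB_INIT", "2021-10-27T14:49:57.000Z"),
    ("JOB_ENDED", "2021-10-27T14:50:00.000Z"),
]


def parse_time(stamp):
    """Parse a timestamp such as 2021-10-27T14:49:52.000Z."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")


def stage_durations(events):
    """Return (from_type, to_type, seconds) for each pair of consecutive events."""
    durations = []
    for (prev_type, prev_time), (next_type, next_time) in zip(events, events[1:]):
        elapsed = (parse_time(next_time) - parse_time(prev_time)).total_seconds()
        durations.append((prev_type, next_type, elapsed))
    return durations


for prev_type, next_type, seconds in stage_durations(EVENTS):
    print(f"{prev_type} -> {next_type}: {seconds:.0f} s")
```

For this Hello World job the whole life cycle, from queueing to completion, takes 8 seconds.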

View job logs

A submitted job usually goes through a regular life cycle, as mentioned in the previous step. The status of a submitted job provides limited information about the current stage a job is in. Additional useful information can be retrieved through logs. Logs are very important when you are debugging code for jobs that execute on remote resources. You can access logs for a job using the logs() function, with the liveOutput parameter set to True, as shown below.

For this Hello World example, when the job has been successfully submitted and executed, the log will display a message about what scripts are running and print the job parameters.

In [9]:
# View job logs
demo_job.logs(liveOutput=True)
📋 Job logs:
📮 Job ID: 1635346193d8oum
🖥 HPC: keeling_community
message time
running setup... running main... ./job.json SLURM_NODEID 0 SLURM_PROCID 0 job_id 1635346193d8oum param_a 32 {'job_id': '1635346193d8oum', 'user_id': 'beckvalle@cybergisx.cigi.illinois.edu', 'maintainer': 'community_contribution', 'hpc': 'keeling_community', 'param': {'a': 32}, 'env': {}, 'executable_folder': '/1635346193d8oum/executable', 'data_folder': '/1635346193d8oum/data', 'result_folder': '/1635346193d8oum/result'} running cleanup... 2021-10-27T14:50:00.000Z

This example illustrated the necessary steps to get started with CyberGIS-Compute on the CyberGISX environment. We went through the setup, basic terminology and life cycle of a simple Hello World job.

In the next example, we will look in more detail at how to set up and run user-specified code in CyberGISX.

Example 2: Setting up Custom Code using CyberGIS-Compute

In the previous example, we ran some code already written by a developer. However, you may want to change how the job runs by setting job parameters and variables.

In this example, we will further discuss how to set parameters and variables to customize how the job uses HPC resources on the CyberGISX environment using CyberGIS-Compute.

Creating a job

As above, we first need to create a job using the create_job() function.

In [10]:
# Create a custom job
custom_job = cybergis.create_job()
🎯 Logged in as beckvalle@cybergisx.cigi.illinois.edu
id | hpc | executableFolder | dataFolder | resultFolder | param | slurm | userId | maintainer | createdAt
1635346203fTD7M | keeling_community | | | | {} | null | beckvalle@cybergisx.cigi.illinois.edu | community_contribution | 2021-10-27T14:50:03.000Z

List HPC resources

You can list the HPC resources available in the current environment using the list_hpc() function for the CyberGISCompute object we created above. Right now we will be using the default keeling_community HPC backend, but you can change the hpc variable within the create_job() function in order to run the code on different computing resources.

In [11]:
# List available HPC resources
cybergis.list_hpc()
hpc | ip | port | is_community_account
keeling_community | keeling.earth.illinois.edu | 22 | True
expanse_community | login.expanse.sdsc.edu | 22 | True
bridges_community | bridges2.psc.edu | 22 | True
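
To run on a different backend, the text above says to change the hpc variable of create_job(). The cell below is a hedged sketch of that idea: create_job_on and AVAILABLE_HPC are hypothetical names, the backend names are copied by hand from the list_hpc() output, and passing hpc as a keyword argument is an assumption based on the description above rather than a confirmed signature.

```python
# Backend names copied by hand from the list_hpc() output above.
AVAILABLE_HPC = ("keeling_community", "expanse_community", "bridges_community")


def create_job_on(client, hpc_name):
    """Create a job on a named backend, rejecting names that were not listed.

    `client` is any object with a create_job(hpc=...) method, for example
    the CyberGISCompute object created earlier (keyword name assumed).
    """
    if hpc_name not in AVAILABLE_HPC:
        raise ValueError(f"unknown HPC backend: {hpc_name!r}")
    return client.create_job(hpc=hpc_name)


# Example use in this notebook (not executed here):
# custom_job = create_job_on(cybergis, "expanse_community")
```

Checking the name locally gives an immediate, readable error for a typo instead of waiting for the remote service to reject the request.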

Accessing custom code

Again we will need to specify a Git repository containing the code. For security reasons, only approved git repositories are allowed. The list of approved and available repositories can be listed by using the list_git() function, as shown in the following cell.

In [12]:
# List approved GitHub repositories
cybergis.list_git()
link | name | container | repository | commit
git://uncertainty_in_spatial_accessibility | Uncertainty_in_Spatial_Accessibility | cybergisx-0.4 | https://github.com/JinwooParkGeographer/Uncertainty-in-Spatial-Accessibility.git | NONE
git://spatial_access_covid-19 | COVID-19 spatial accessibility | cybergisx-0.4 | https://github.com/cybergis/cybergis-compute-spatial-access-covid-19.git | NONE
git://mpi_hello_world | MPI Hello World | mpich | https://github.com/cybergis/cybergis-compute-mpi-helloworld.git | NONE
git://hello_world | hello world | python | https://github.com/cybergis/cybergis-compute-hello-world.git | NONE
git://fireabm | hello FireABM | cybergisx-0.4 | https://github.com/cybergis/cybergis-compute-fireabm.git | NONE
git://data_fusion | data fusion | python | https://github.com/CarnivalBug/data_fusion.git | NONE
git://cybergis-compute-modules-test | modules test | cjw-eb | https://github.com/alexandermichels/cybergis-compute-modules-test.git | NONE
git://bridge_hello_world | hello world | python | https://github.com/cybergis/CyberGIS-Compute-Bridges-2.git | NONE

As above, we will use git://hello_world which points to the https://github.com/cybergis/cybergis-compute-hello-world.git git repository.
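
The relationship between a short link and its repository can be pictured as a simple lookup. The cell below is only an illustration: APPROVED_REPOS is a hand-copied subset of the list_git() output, and resolve_link is a hypothetical helper, not an SDK function.

```python
# A hand-copied subset of the approved repositories listed above.
APPROVED_REPOS = {
    "git://hello_world": "https://github.com/cybergis/cybergis-compute-hello-world.git",
    "git://mpi_hello_world": "https://github.com/cybergis/cybergis-compute-mpi-helloworld.git",
    "git://fireabm": "https://github.com/cybergis/cybergis-compute-fireabm.git",
}


def resolve_link(link):
    """Return the repository URL for an approved short link."""
    try:
        return APPROVED_REPOS[link]
    except KeyError:
        raise ValueError(f"{link!r} is not an approved repository") from None


print(resolve_link("git://hello_world"))
```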

Stages of execution

A job that runs using remote HPC resources can be divided into three primary stages: initialize, execute, and finalize. When we create custom code, we can specify what files should be run at each of these stages. The git://hello_world repository contains three Python files, one for each stage.

  • Initialization Stage

    • In general the initialization stage will specify and set up a one-time initial configuration for the job. For instance, code for this stage will set global variables, parse logic for data, etc. In the current example, code for this stage is specified in setup.py which simply prints a message running setup.
  • Execution Stage

    • The execution stage contains the main logic for the job. Generally this will include the parallel and distributed logic that takes advantage of the CyberGISX High Performance Computing backend, for instance multi-threaded or distributed code. In this example, this is specified in main.py which simply prints a message running main.
  • Finalization Stage

    • As the name suggests, the code in this stage is intended for tasks like clearing up job-specific configurations. In this example, this is specified in cleanup.py which simply prints a message running cleanup.

Note that the only stage that needs to be set is the Execution Stage.
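
The ordering of the three stages, with only the execution stage required, can be sketched in plain Python. This is an illustration of the concept, not how CyberGIS-Compute is implemented: run_stages is a hypothetical helper, and the three small functions are stand-ins for the repository's scripts.

```python
def run_stages(execute, initialize=None, finalize=None):
    """Run the stages in order and return the names of the stages that ran.

    Only `execute` is required; the other two stages are skipped when omitted.
    """
    completed = []
    ordered = (("initialize", initialize), ("execute", execute), ("finalize", finalize))
    for name, stage in ordered:
        if stage is None:
            continue  # optional stage was not provided
        stage()
        completed.append(name)
    return completed


# Stand-ins for the three scripts in the hello world repository.
def setup():
    print("running setup...")


def main():
    print("running main...")


def cleanup():
    print("running cleanup...")


run_stages(main, initialize=setup, finalize=cleanup)  # all three stages
run_stages(main)                                      # execution stage only
```

The first call prints the same three messages, in the same order, that appear in the Hello World job log.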

Configuring the manifest file

A code repository intended for use with CyberGIS-Compute needs to have a manifest.json file, which specifies which code to execute in the different job stages.

A typical manifest.json will have the following format:

{
    "name": "hello world",
    "container": "python",
    "execution_stage": "python {{JOB_EXECUTABLE_PATH}}/main.py"
}

Note that {{JOB_EXECUTABLE_PATH}} will be replaced during the remote job submission process by the actual path needed to run the job.

This example is set up to run a Python script; however, other commands can also be used, such as "execution_stage": "bash run_job.sh".
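
To make the placeholder substitution concrete, the cell below parses the manifest shown above and fills in {{JOB_EXECUTABLE_PATH}} by hand. This is a sketch only: the real substitution happens on the service side during job submission, render_command is a hypothetical helper, and the path is borrowed from the executable_folder value printed in the Example 1 log.

```python
import json

# The manifest shown above, as a string.
MANIFEST_TEXT = """
{
    "name": "hello world",
    "container": "python",
    "execution_stage": "python {{JOB_EXECUTABLE_PATH}}/main.py"
}
"""


def render_command(manifest, stage, executable_path):
    """Fill the {{JOB_EXECUTABLE_PATH}} placeholder in one stage command."""
    return manifest[stage].replace("{{JOB_EXECUTABLE_PATH}}", executable_path)


manifest = json.loads(MANIFEST_TEXT)
# Path borrowed from the 'executable_folder' value in the Example 1 log.
command = render_command(manifest, "execution_stage", "/1635346193d8oum/executable")
print(command)  # python /1635346193d8oum/executable/main.py
```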

Specifying the custom code repository

Now we need to specify the git repository containing the code for the job object. You can do this using the set() function and the executableFolder parameter. This job also needs a custom variable passed to it. You do this by passing a dictionary called param to the set() function. Note that we are passing a variable called a to CyberGISCompute. Within the job, this variable will be accessed using the name param_a.

In [13]:
# Specify GitHub repository
custom_job.set(executableFolder='git://hello_world', param={"a": "param a is set"})
id | hpc | executableFolder | dataFolder | resultFolder | param | slurm | userId | maintainer | createdAt
1635346203fTD7M | keeling_community | git://hello_world | | | {"a": "param a is set"} | null | beckvalle@cybergisx.cigi.illinois.edu | community_contribution | 2021-10-27T14:50:03.000Z
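
The Example 1 log prints param_a next to SLURM_NODEID and SLURM_PROCID, which suggests that parameters reach the job as environment variables with a param_ prefix. That is an assumption this notebook does not confirm. Under that assumption, a job script could collect its parameters as in the sketch below, where read_params is a hypothetical helper.

```python
import os


def read_params(environ=None, prefix="param_"):
    """Collect variables whose names start with `prefix`, with the prefix removed.

    Assumes parameters are exposed as environment variables such as param_a.
    """
    if environ is None:
        environ = os.environ
    return {
        name[len(prefix):]: value
        for name, value in environ.items()
        if name.startswith(prefix)
    }


# A stand-in environment so the sketch also runs outside an HPC job.
fake_environ = {"param_a": "param a is set", "SLURM_PROCID": "0"}
print(read_params(fake_environ))
```

Note that environment variable values are always strings, so a numeric parameter such as 32 would need to be converted inside the job.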

Submitting the job and tracking progress

Now we have everything set up to submit our new job with the custom code from the configured Git repository. The remaining steps of submitting the job and tracking its status work the same as in the first example.

In [14]:
# Submit custom job
custom_job.submit()
✅ job submitted
id | hpc | executableFolder | dataFolder | resultFolder | param | slurm | userId | maintainer | createdAt
1635346203fTD7M | keeling_community | git://hello_world | | | {"a": "param a is set"} | null | beckvalle@cybergisx.cigi.illinois.edu | community_contribution | 2021-10-27T14:50:03.000Z
Out[14]:
<cybergis_compute_client.Job.Job at 0x7fd7c0abd850>

You can get information about the job using the status() function as shown below.

In [15]:
# Check custom job status
custom_job.status()
Out[15]:
{'id': '1635346203fTD7M',
 'userId': 'beckvalle@cybergisx.cigi.illinois.edu',
 'secretToken': 'tkeJM0NOxIbcLiC7GMdUb0gYEevuz1FqkhK3okCGjf8CO',
 'maintainer': 'community_contribution',
 'hpc': 'keeling_community',
 'executableFolder': 'git://hello_world',
 'dataFolder': None,
 'resultFolder': None,
 'param': {'a': 'param a is set'},
 'env': {},
 'slurm': None,
 'createdAt': '2021-10-27T14:50:03.000Z',
 'updatedAt': '2021-10-27T14:50:03.000Z',
 'deletedAt': None,
 'initializedAt': None,
 'finishedAt': None,
 'isFailed': False,
 'logs': [],
 'events': [{'id': 492,
   'jobId': '1635346203fTD7M',
   'type': 'JOB_QUEUED',
   'message': 'job [1635346203fTD7M] is queued, waiting for registration',
   'createdAt': '2021-10-27T14:50:07.000Z',
   'updatedAt': '2021-10-27T14:50:07.000Z',
   'deletedAt': None}]}
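
The status dictionary is fairly long, so it can help to reduce it to a few fields. The cell below is a minimal sketch: summarize_status is a hypothetical helper, the sample is a trimmed copy of the output above, and it assumes that events are listed in chronological order and that a non-empty finishedAt means the job is done.

```python
def summarize_status(status):
    """Reduce a status dictionary to its id, overall state, and latest event type."""
    events = status.get("events", [])
    latest_event = events[-1]["type"] if events else None
    if status.get("isFailed"):
        state = "failed"
    elif status.get("finishedAt") is not None:
        state = "finished"
    else:
        state = "in progress"
    return {"id": status["id"], "state": state, "latest_event": latest_event}


# A trimmed copy of the status output shown above.
sample_status = {
    "id": "1635346203fTD7M",
    "isFailed": False,
    "finishedAt": None,
    "events": [{"id": 492, "type": "JOB_QUEUED"}],
}
print(summarize_status(sample_status))
```

In this notebook the same helper could be applied to the result of custom_job.status().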

Next you can track the job progress.

In [16]:
# Check custom job events
custom_job.events(liveOutput=True)
📋 Job events:
📮 Job ID: 1635346203fTD7M
🖥 HPC: keeling_community
types | message | time
JOB_QUEUED | job [1635346203fTD7M] is queued, waiting for registration | 2021-10-27T14:50:07.000Z
JOB_REGISTERED | job [1635346203fTD7M] is registered with the supervisor, waiting for initialization | 2021-10-27T14:50:09.000Z
JOB_INIT | job [1635346203fTD7M] is initialized, waiting for job completion | 2021-10-27T14:50:11.000Z
JOB_ENDED | job [1635346203fTD7M] finished | 2021-10-27T14:50:15.000Z

As above, you can check the job output using the logs() function.

In [17]:
# Check custom job logs
custom_job.logs(liveOutput=True)
📋 Job logs:
📮 Job ID: 1635346203fTD7M
🖥 HPC: keeling_community
message time
running setup... running main... ./job.json SLURM_NODEID 0 SLURM_PROCID 0 job_id 1635346203fTD7M param_a param a is set {'job_id': '1635346203fTD7M', 'user_id': 'beckvalle@cybergisx.cigi.illinois.edu', 'maintainer': 'community_contribution', 'hpc': 'keeling_community', 'param': {'a': 'param a is set'}, 'env': {}, 'executable_folder': '/1635346203fTD7M/executable', 'data_folder': '/1635346203fTD7M/data', 'result_folder': '/1635346203fTD7M/result'} running cleanup... 2021-10-27T14:50:15.000Z

Finally, you can download the output and error messages using the downloadResultFolder() function. The zip file will be placed in the custom_result folder, which the next cell creates in the same location as this notebook.

In [18]:
# create results folder to store downloaded results
import os
if not os.path.isdir("custom_result"):
    os.mkdir("custom_result")
In [19]:
# Download results from custom job
custom_job.downloadResultFolder('./custom_result')
file successfully downloaded under: ./custom_result/1635346209ZGWs.zip
Out[19]:
'./custom_result/1635346209ZGWs.zip'

This example illustrated the necessary steps to run custom code on the CyberGISX environment using CyberGIS-Compute. We went quickly through how to set up and specify a git repository that contains code and ran the custom job by specifying where to access the code.

Future examples will demonstrate how to run existing maintainers with custom data and how to run custom code that uses custom data.

Example 3: Using the Job Submission User Interface

We are working on a graphical job submission user interface to simplify the job submission process and help make sure the submitted options make sense for the submitted job. Although this interface is under active development, it still can be used to select and run jobs.

Try setting the Git Repository to git://cybergis-compute-modules-test.

In [20]:
cybergis.create_job_by_UI()

Slurm Options:

Click checkboxes to enable an option and overwrite its default config value. All configs are optional. Please refer to the official Slurm documentation.

Globus File Upload/Download:

Example 4: Using CyberGIS-Compute to run an Evacuation Computation

In this final example, we will use CyberGIS-Compute to run a more complex example based on an agent-based model of evacuation.

Review available resources

Using the commands described above, we can review the available resources, maintainers, and custom code repositories.

In [21]:
# Review CyberGIS-Compute resources
cybergis.list_git()
link | name | container | repository | commit
git://uncertainty_in_spatial_accessibility | Uncertainty_in_Spatial_Accessibility | cybergisx-0.4 | https://github.com/JinwooParkGeographer/Uncertainty-in-Spatial-Accessibility.git | NONE
git://spatial_access_covid-19 | COVID-19 spatial accessibility | cybergisx-0.4 | https://github.com/cybergis/cybergis-compute-spatial-access-covid-19.git | NONE
git://mpi_hello_world | MPI Hello World | mpich | https://github.com/cybergis/cybergis-compute-mpi-helloworld.git | NONE
git://hello_world | hello world | python | https://github.com/cybergis/cybergis-compute-hello-world.git | NONE
git://fireabm | hello FireABM | cybergisx-0.4 | https://github.com/cybergis/cybergis-compute-fireabm.git | NONE
git://data_fusion | data fusion | python | https://github.com/CarnivalBug/data_fusion.git | NONE
git://cybergis-compute-modules-test | modules test | cjw-eb | https://github.com/alexandermichels/cybergis-compute-modules-test.git | NONE
git://bridge_hello_world | hello world | python | https://github.com/cybergis/CyberGIS-Compute-Bridges-2.git | NONE

Next, as above, we create a job, set the GitHub repository to the evacuation repo, and submit the job.

Submit the job

In [22]:
# Create job, set GitHub repo, and submit job
demo_job = cybergis.create_job()
demo_job.set(executableFolder="git://fireabm", param={"start_value": 20}, 
             slurm = {"num_of_task": 2, "walltime": "10:00"})
demo_job.submit()
🎯 Logged in as beckvalle@cybergisx.cigi.illinois.edu
id | hpc | executableFolder | dataFolder | resultFolder | param | slurm | userId | maintainer | createdAt
1635346224qotAj | keeling_community | | | | {} | null | beckvalle@cybergisx.cigi.illinois.edu | community_contribution | 2021-10-27T14:50:24.000Z
id | hpc | executableFolder | dataFolder | resultFolder | param | slurm | userId | maintainer | createdAt
1635346224qotAj | keeling_community | git://fireabm | | | {"start_value": 20} | {"num_of_task": 2, "walltime": "10:00"} | beckvalle@cybergisx.cigi.illinois.edu | community_contribution | 2021-10-27T14:50:24.000Z
✅ job submitted
id | hpc | executableFolder | dataFolder | resultFolder | param | slurm | userId | maintainer | createdAt
1635346224qotAj | keeling_community | git://fireabm | | | {"start_value": 20} | {"num_of_task": 2, "walltime": "10:00"} | beckvalle@cybergisx.cigi.illinois.edu | community_contribution | 2021-10-27T14:50:24.000Z
Out[22]:
<cybergis_compute_client.Job.Job at 0x7fd7c09a0490>

We can now view the job events while the job is running.

In [23]:
# View job events
demo_job.events(liveOutput=True)
📋 Job events:
📮 Job ID: 1635346224qotAj
🖥 HPC: keeling_community
types | message | time
JOB_QUEUED | job [1635346224qotAj] is queued, waiting for registration | 2021-10-27T14:50:24.000Z
JOB_REGISTERED | job [1635346224qotAj] is registered with the supervisor, waiting for initialization | 2021-10-27T14:50:27.000Z
JOB_INIT | job [1635346224qotAj] is initialized, waiting for job completion | 2021-10-27T14:50:30.000Z
JOB_ENDED | job [1635346224qotAj] finished | 2021-10-27T14:51:17.000Z

Once the job has finished, you can view logs. In this case, information about the running software is displayed.

In [24]:
# View job logs
demo_job.logs(liveOutput=True)
📋 Job logs:
📮 Job ID: 1635346224qotAj
🖥 HPC: keeling_community
message time
node id: 0, task id: 1, start number: 20, SEED: 21, result folder: /1635346224qotAj/result /1635346224qotAj/executable node id: 0, task id: 0, start number: 20, SEED: 20, result folder: /1635346224qotAj/result /1635346224qotAj/executable copying over files using FireABM_opt !! starting file parse at: 09:50:45 !! Working Directory: /1635346224qotAj/executable !! checking input parameters !! input parameters OK !! starting full run at 09:50:45 !! run simulation run params: 100% shortest ...[download for full log] 2021-10-27T14:51:17.000Z
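
The log shows two tasks, matching num_of_task set to 2, with SEED 20 for task 0 and SEED 21 for task 1. This is consistent with each task adding its task id to start_value, although whether the FireABM code computes the seed exactly this way is an assumption. The cell below sketches that pattern; seed_for_task is a hypothetical helper.

```python
def seed_for_task(start_value, task_id):
    """Derive a distinct seed per task (assumed rule: start_value + task id)."""
    return start_value + task_id


start_value = 20  # the value passed in param above
num_of_task = 2   # the value passed in slurm above

for task_id in range(num_of_task):
    print(f"task id: {task_id}, SEED: {seed_for_task(start_value, task_id)}")
```

Giving every task its own seed lets parallel runs of the same simulation explore different random outcomes instead of repeating one another.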

Download results

Now we can save the completed evacuation computation results to a local folder and extract them from the zip file.

In [25]:
# create results folder to store downloaded results
import os
if not os.path.isdir("fireabm_result/result/"):
    os.makedirs("fireabm_result/result/")
In [26]:
# Save results to zip
result_zip = demo_job.downloadResultFolder('./fireabm_result')
file successfully downloaded under: ./fireabm_result/1635346227u7qn.zip
In [27]:
# Extract results
import os, zipfile
extract_results_to = "fireabm_result/result/"
with zipfile.ZipFile(result_zip, 'r') as zip_ref:
    zip_ref.extractall(extract_results_to)
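
Before or after extracting, it can be useful to see what the archive contains. The cell below is a small sketch using only the Python standard library: list_result_files is a hypothetical helper, and the demo builds a throwaway archive so the cell runs without a downloaded result.

```python
import os
import tempfile
import zipfile


def list_result_files(zip_path):
    """Return the names of the files (not directories) stored in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [info.filename for info in archive.infolist() if not info.is_dir()]


# Demo on a throwaway archive so the sketch runs without a real result file.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_zip = os.path.join(tmp_dir, "demo.zip")
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("result/output.txt", "hello")
    print(list_result_files(demo_zip))
```

In this notebook the same helper could be called as list_result_files(result_zip).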

Display results

Finally, we can display the video produced by the computation.

In [28]:
# Display results from the job
import glob
from IPython.display import Video, HTML
rfile = glob.glob("./fireabm_result/result/demo_quick_start20/1videos/*.mp4")[0]
HTML('<video width="300" controls><source src="%s" type="video/mp4"></video>'%rfile)
Out[28]:

This example illustrated how to run a more complex custom program on the CyberGISX environment using CyberGIS-Compute.

Future examples will demonstrate how to run existing maintainers with custom data and how to run custom code that uses custom data.
