Most of this notebook covers advanced options and the technical details behind our new design. There are, however, a few key things all users should know:
We now install and manage software differently, so any hard-coded paths to executables have probably changed. The `which` and `whereis` commands are excellent tools for finding the new locations. If you need to know where the Python executable behind a kernel is, you can use:
import sys
print(sys.executable)
For more details on how software is now managed, see the Modules section of this notebook.
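If you would rather look up locations from Python than from the shell, the standard library's `shutil.which` performs the same kind of PATH search as `which`. The sketch below is only an illustration: the `locate` helper and the command names in the loop are examples, not something this release provides.

```python
import shutil


def locate(command, search_path=None):
    """Return the full path of `command`, or None if it is not found.

    `search_path` is an optional PATH-style string; by default the
    PATH of the running kernel is searched.
    """
    return shutil.which(command, path=search_path)


# Example command names only; substitute the executables you rely on.
for name in ("python3", "mpirun", "summa.exe"):
    print(f"{name}: {locate(name) or 'not found on PATH'}")
```

Resolving executables at run time like this avoids hard-coding paths that may move again in a later release.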
cjw command to manage kernel versions

We know the dropdown is currently crowded and will only get more crowded as we release new kernel versions. All versions of the kernels are currently installed by default, but this will not be the case in the future. To ensure that you always have access to the kernels you use, we have provided a `cjw` command-line interface for kernel management. For more information on the `cjw` command, see the cjw CLI section of this notebook.
Kernels still work the same way; the conda environments that provide them have just changed locations. They are now installed on an NFS share that is mounted in each container:
ls /data/cigi/cjw-easybuild/conda/
You may have noticed in the dropdown that each type of kernel (TauDEM, SUMMA, etc.) has two versions, one versioned and one unversioned. The difference is explained in the cjw CLI section below.
cjw CLI

As we add more kernels with each release, we won't display all of them in the dropdown by default, so we have also provided a tool to see the available kernels and install old versions of them. This command-line tool is `cjw`.
cjw -h
cjw --avail
You'll notice there is an unversioned kernel of each type (for example, `python3` vs. `python3-2021-09`). This is the `*-dev` version of the kernel, which is continually updated to point at the latest release. This means that notebooks not saved with a versioned kernel will continue to open and will, by default, open with the latest release.
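To make the naming scheme concrete, here is a small sketch that separates a kernel name into its base name and release. It assumes that versioned kernels always end in a `-YYYY-MM` suffix, as `python3-2021-09` does; the `split_kernel_name` helper is hypothetical and not part of `cjw`.

```python
import re

# Assumption: versioned kernel names end in "-YYYY-MM", e.g. python3-2021-09.
_VERSIONED = re.compile(r"^(?P<base>.+)-(?P<version>\d{4}-\d{2})$")


def split_kernel_name(name):
    """Split a kernel name into (base, version); version is None if unversioned."""
    match = _VERSIONED.match(name)
    if match:
        return match.group("base"), match.group("version")
    return name, None


for kernel in ("python3", "python3-2021-09"):
    base, version = split_kernel_name(kernel)
    label = version if version else "unversioned (tracks the latest release)"
    print(f"{kernel}: base={base}, version={label}")
```

A notebook saved with the unversioned name follows new releases, while one saved with the versioned name stays pinned to that release.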
For this release, we have all versions of the kernels installed, but let's remove a kernel and re-install just to show how the tool works! We will start by listing all of the kernels:
jupyter kernelspec list
Now, let's remove python3-2021-09 and list the kernels again:
jupyter kernelspec remove -f python3-2021-09
jupyter kernelspec list
You can see that kernel is no longer listed, so let's reinstall and list kernels again!
cjw -i python3-2021-09
jupyter kernelspec list
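The same check can be scripted. The sketch below parses the output of `jupyter kernelspec list --json` to test whether a given kernel is installed. The helper names are illustrative, and the kernel name in the example is simply the one used above.

```python
import json
import shutil
import subprocess


def kernel_names(listing_json):
    """Return the sorted kernel names from `jupyter kernelspec list --json` output."""
    return sorted(json.loads(listing_json).get("kernelspecs", {}))


def is_installed(kernel, listing_json):
    """True if `kernel` appears in the given kernelspec listing."""
    return kernel in kernel_names(listing_json)


def current_listing():
    """Run `jupyter kernelspec list --json`; return its output, or None on failure."""
    if shutil.which("jupyter") is None:
        return None
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout if result.returncode == 0 else None


if __name__ == "__main__":
    listing = current_listing()
    if listing is not None:
        try:
            print("python3-2021-09 installed:", is_installed("python3-2021-09", listing))
        except ValueError:
            print("could not parse the kernelspec listing")
```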
Once a kernel is installed with the `cjw` command, it is added as a personal kernel that will be available even if you restart the container!
cjw -p
To remove a kernel from your list of personal kernels, you can use the `-rp` or `-remove-personal` flags (note that removing a kernel from your personal kernels doesn't remove it from Jupyter, but it will not be there upon restart of the container unless it is a default kernel):
cjw -rp python3-2021-09
cjw -p
If our kernels don't provide everything you need, you can still use your in-container `pip` and `conda` to install packages as needed.
pip install disjoint_set
pip show disjoint-set
python3 -c "import disjoint_set"
conda install -y affine
/opt/conda/bin/python3 -c "import affine"
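From inside a notebook you can run the same import check without leaving Python. This minimal sketch uses `importlib.util.find_spec`, which only reports on the interpreter that is running the code. A package installed into a different Python (for example `/opt/conda/bin/python3`) will not show up here. The `is_importable` helper is illustrative.

```python
import importlib.util


def is_importable(module_name):
    """Return True if `module_name` can be found by the current interpreter."""
    return importlib.util.find_spec(module_name) is not None


# Use import names, which may differ from the pip/conda package name
# (the disjoint-set package, for example, imports as disjoint_set).
for module in ("disjoint_set", "affine"):
    status = "available" if is_importable(module) else "missing"
    print(f"{module}: {status}")
```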
You might be asking:
If the non-kernel software is installed on the NFS, how do I interact with it in my notebook?
This is done using the `prepend_and_launch.sh` script for each kernel. This script is generally available at `~/.local/share/jupyter/kernels/<name of kernel>/prepend_and_launch.sh`. Note that the name of the kernel in that path is not necessarily the display name. Let's look at what we have in that path and see how the module is being loaded for this kernel!
cat ~/.local/share/jupyter/kernels/bash/prepend_and_launch.sh
The first line simply declares that it is a bash script. The third line, `module load cjw/2021-09`, is the line that loads the appropriate "metamodule" for your kernel (more on modules in the next section!). The last line, `exec "$@"`, just tells the kernel to run. The second and fourth lines set environment variables, but to understand them, we first need to look at the `kernel.json` that declares the `$MY_PROJ_LIB` and `$MY_ENV_BIN_DIR` variables:
cat ~/.local/share/jupyter/kernels/bash/kernel.json
The `kernel.json` defines the `$MY_PROJ_LIB` and `$MY_ENV_BIN_DIR` variables and tells Jupyter to run the `prepend_and_launch.sh` script before starting up the kernel! Another feature of this release is that kernels/conda environments are also installed outside of the container, meaning that we can keep old versions of kernels exactly as they were installed! Now we will look at what that `module load` command actually means.
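If you want to inspect a kernel definition programmatically rather than with `cat`, the sketch below reads a `kernel.json` and reports its launch command and environment variables. It relies only on the standard `argv` and `env` fields of a Jupyter kernel spec. The helper names are illustrative, and the `bash` kernel directory is the one used in the example above.

```python
import json
from pathlib import Path


def load_kernel_spec(kernel_dir):
    """Read kernel.json from a kernel directory and return it as a dict."""
    with open(Path(kernel_dir) / "kernel.json", encoding="utf-8") as handle:
        return json.load(handle)


def summarize(spec):
    """Return (launch command, environment variables) declared by a kernel spec."""
    return spec.get("argv", []), spec.get("env", {})


# Same directory as the `cat` example; skipped quietly if it does not exist.
kernel_dir = Path.home() / ".local/share/jupyter/kernels/bash"
if (kernel_dir / "kernel.json").is_file():
    argv, env = summarize(load_kernel_spec(kernel_dir))
    print("launch command:", " ".join(argv))
    for key, value in sorted(env.items()):
        print(f"{key}={value}")
```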
A key tool for this installation is Lmod. You may also be familiar with Environment Modules, a very similar tool that Lmod was designed to improve on. Both tools help manage compute environments by giving users the ability to `load` and `unload` software. You may also find the User Guide for Lmod useful. If you are interested in a full list of module commands and options, you can use the `module --help` command:
module --help
To check what software is available, you can use the `module avail` command:
module avail
You may have noticed that some of the modules have a "(L)" next to them. That's because we already loaded those modules for you when you started the kernel! Every kernel we provide will load the appropriate modules so that users don't have to think about it. To see the software that is loaded, you can use the `module list` command:
module list
To get more information on any of those pieces of software, here is a simple command you can try:
module whatis find_inlets
You can also search for modules by keyword!
module keyword geo
Now that you know the basics of using modules, you're probably wondering: "how does this help me?" The answer is that modules are nice because loading and unloading allows you to automatically set/unset environment variables associated with software. So, for example, let's check the `$PATH` and `$LD_LIBRARY_PATH` with the cjw module loaded:
module load cjw/2021-09
echo $PATH
echo $LD_LIBRARY_PATH
You can also unload software if you want to. `module unload <x>` will remove package x, or you can use `module purge` to unload everything:
module purge
echo "PATH is $PATH"
echo "LD_LIBRARY_PATH is $LD_LIBRARY_PATH"
module load cjw/2021-09
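To see exactly what a module contributes, you can compare the variable before and after loading. The sketch below does this for two PATH-style strings. The `added_entries` helper is illustrative and the directory names are made-up placeholders, so paste in your own `echo $PATH` output to try it.

```python
import os


def added_entries(before, after):
    """Return entries present in `after` but not in `before`, in order.

    Both arguments are PATH-style strings separated by os.pathsep.
    """
    seen = set(filter(None, before.split(os.pathsep)))
    return [p for p in after.split(os.pathsep) if p and p not in seen]


# Hypothetical values for illustration only.
purged = os.pathsep.join(["/usr/bin", "/bin"])
loaded = os.pathsep.join(["/example/software/bin", "/usr/bin", "/bin"])
print(added_entries(purged, loaded))
```

The same comparison works for `$LD_LIBRARY_PATH` or any other colon-separated variable a module modifies.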
While this may seem complex, all of this is taken care of for the user by that `module load` command in the `prepend_and_launch.sh` script that runs for each kernel! Next, we answer what the `cjw` module is.
We understand that even many advanced users will be overwhelmed by the large amount of software installed. To simplify things, we have put together what we are calling "metamodules", which are modules that load a set of modules that together provide a complete compute environment. The current release uses the `cjw/2021-09` module. To get some information on what that does, you can use the `module show <module of interest>` command:
module show cjw/2021-09
The `cjw/2021-09` module loads GRASS, MPICH, RHESSysEastCoast, SUMMA, TauDEM, WRF, WPS, find_inlets, NCO, and all of their dependencies. As you can see from the `module list` command, all of the necessary software is right there for you to use!
module list
Since all of the software is already loaded for you, you can use the commands right here in the notebook:
# netcdf
nc-config --all
# netcdf-fortran
nf-config --all
# mpi
which mpirun
# summa
summa.exe --version