lasso / containers / run-lasso-o_shcu · Commits · d7e87dac

Commit d7e87dac, authored May 19, 2021 by Carina Lansing

Updated the shifter and singularity instructions and documentation in the run.sh file.

Parent: 63d120a0
Changes: 3
README-SHIFTER.md
# LASSO-O Shifter Instructions
Follow these instructions if you will be running LASSO-O at NERSC or on an HPC cluster with a Shifter module available.
## Setting up the Shifter environment on your HPC cluster

If you are running at NERSC, Shifter will be available in the default environment. If you are running on an HPC cluster other than NERSC, Shifter should be available via a module load. To see whether Shifter is available on your cluster, type the following:
```bash
$ module avail
```
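Most module systems (Lmod and Environment Modules) also accept a name filter, which makes it easier to spot a Shifter module and its available versions; the output will vary by site:

```bash
$ module avail shifter
```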
#### Load Shifter CLI on your HPC system

If you are not on NERSC, use `module load` to load the Shifter command-line client. For example:
```bash
$ module load shifter/3.7.1
```
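As a quick sanity check, you can confirm that the module loaded and that the `shifter` command is now on your PATH (standard module/shell commands; the exact module name will differ by site):

```bash
$ module list
$ which shifter
```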
Once in your compute node shell with the Shifter module loaded,
you can run LASSO-O using the following instructions.
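One common way to get an interactive compute-node shell on Slurm systems is an `salloc` allocation. This is only a sketch: the time limit is an assumption, and your site may also require account, QoS, or partition flags.

```bash
$ salloc --nodes=1 --ntasks=1 --time=00:30:00
```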
## Running LASSO-O via Shifter

You can start LASSO-O by executing the `run.sh` script via a scheduler command. Below we provide examples for the Slurm and PBS schedulers. (The Slurm scheduler is used at NERSC.)

These commands instruct the scheduler to run LASSO-O with one node and one CPU. LASSO-O runs a series of processes for each simulation provided as input, and currently the container executes each process in sequence; support for multiple cores will be added later. Until then, LASSO-O will take a while to run depending upon the number of simulations you are processing. In addition, the first time you run the lasso-o_shcu container, the container runtime will download the container image, which will also take a few minutes.
<div style="background-color: #F9F5D2; border: 1px solid grey; margin: 10px; padding: 10px;">
<strong>NOTE:</strong> If you do not use a scheduler to invoke the script, it will run on the login node, where it may be killed per host policy if it runs for too long.
</div>
When your job has completed, you may view the outputs created in the `run-lasso-o_shcu/data/outputs` folder using the notebooks provided. See the [notebooks/README.md](notebooks/README.md) file for more information.
#### Start via Slurm

* **Step 1: Run job**

```bash
$ srun --verbose --nodes=1 --ntasks=1 --cpus-per-task=1 ./run.sh
```
* **Step 2: Check job status**

To check the status of your job, list the queue for your user ID:

```bash
$ squeue -u [user_name]
```
You can also get more information about the running job using the `scontrol` command and the jobID printed out from the `squeue` command:

```bash
$ scontrol show jobid -dd [jobID]
```
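If you prefer to submit a batch job rather than run `srun` interactively, a minimal Slurm batch script along the following lines may work. This is a sketch: the file name `slurm_sub.sh`, the walltime, and the module version are assumptions, and your site may also require account, QoS, or constraint directives.

```bash
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=04:00:00

# Load Shifter if it is not in your default environment (module name is site-specific)
module load shifter/3.7.1

# Run LASSO-O from the run-lasso-o_shcu folder
./run.sh
```

Submit it with `sbatch slurm_sub.sh` and monitor it with `squeue` as shown above.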
#### Start via PBS
PBS jobs must be started with a batch script:
* **Step 1: Edit batch script**

To run LASSO-O, first edit the `pbs_sub.sh` file to use the appropriate parameters for your environment. In particular, the account name, group_list, and QoS parameters should definitely be changed, but other parameters may also be adjusted as needed. In addition, the `module load` commands should be adjusted to load the Shifter module that is available at your cluster. A sketch of such a script is shown below.
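The directive values in this sketch (account, group, QoS, walltime, module version) are placeholders, and the exact set of directives in the shipped `pbs_sub.sh` may differ; it is shown only to illustrate what needs adjusting.

```bash
#!/bin/bash
#PBS -N lasso-o
#PBS -A your_account               # account name (site-specific)
#PBS -W group_list=your_group      # group_list (site-specific)
#PBS -l qos=std                    # QoS (syntax varies by PBS flavor)
#PBS -l nodes=1:ppn=1
#PBS -l walltime=04:00:00

# Adjust to the Shifter module available at your cluster
module load shifter/3.7.1

# Run LASSO-O from the submission directory
cd $PBS_O_WORKDIR
./run.sh
```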
* **Step 2: Submit job**

After you have edited the batch script, you should be able to submit a batch job via:

```bash
$ qsub pbs_sub.sh -d .
```
* **Step 3: Check job status**

To check the status of your job, list the queue for your user ID:

```bash
$ qstat -u [user_name]
```
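Analogous to `scontrol show jobid` in the Slurm instructions, `qstat -f` with the job ID prints the full record for a single job:

```bash
$ qstat -f [jobID]
```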
README-SINGULARITY.md
...
...
@@ -52,41 +52,52 @@ information.
#### Start via Slurm
* **Step 1: Run job**

```bash
$ srun --verbose --nodes=1 --ntasks=1 --cpus-per-task=1 ./run.sh
```

* **Step 2: Check job status**

To check the status of your job, list the queue for your user ID:

```bash
$ squeue -u [user_name]
```

You can also get more information about the running job using the `scontrol` command and the jobID printed out from the `squeue` command:

```bash
$ scontrol show jobid -dd [jobID]
```
#### Start via PBS
PBS jobs must be started with a batch script:

* **Step 1: Edit batch script**

To run LASSO-O, first edit the `pbs_sub.sh` file to use the appropriate parameters for your environment. In particular, the account name, group_list, and QoS parameters should definitely be changed, but other parameters may also be adjusted as needed. In addition, the `module load` commands should be adjusted to load the Singularity module that is available at your cluster, as shown in the snippet below.
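For example, the module line in `pbs_sub.sh` would be swapped for your site's Singularity module; the module name and version shown here are assumptions for illustration.

```bash
# in pbs_sub.sh
module load singularity/3.7.3
```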
* **Step 2: Submit job**

After you have edited the batch script, you should be able to submit a batch job via:
```bash
$ qsub pbs_sub.sh -d .
```
* **Step 3: Check job status**

To check the status of your job, list the queue for your user ID:
```bash
$ qstat -u [user_name]
```
run.sh
...
...
@@ -14,15 +14,15 @@ set -e
show_help() {
    echo ""
    echo -e "$GREEN--------------------------------------------------------------------------$NC"
    echo -e "$GREEN This script helps you to run the LASSO-O container in either Docker,$NC"
    echo -e "$GREEN Singularity, or Shifter environments.$NC"
    echo -e "$GREEN--------------------------------------------------------------------------$NC"
    echo ""
    echo "SYNTAX: ./run.sh [-h]"
    echo ""
    echo "PREREQUISITES: "
    echo "  1) Make sure your Docker, Singularity, or Shifter environments"
    echo "     are available. See README.md for more information on setting"
    echo "     up your container runtime environment."
    echo ""
    echo "  2) Make sure to configure the config.yml file with the "
...
...
@@ -101,22 +101,18 @@ run_singularity() {
}
run_shifter() {
    # Note that it appears that shifter requires admins to pull the image
    # into their cache in order to run. On NERSC, I had to submit a ticket
    # for the image to be pulled, and it looks like we have to do the same thing on cumulus!

    # Older versions of shifter don't support the --env parameter, so we are exporting this
    # variable directly into the environment so it will be inherited by the shifter container.
    export BEGIN_DATETIME=$begin_datetime

    shifter \
        --env=BEGIN_DATETIME=$begin_datetime \
        --volume=$input_folder:/data/lasso/inputs \
        --volume=$output_folder:/data/lasso/outputs \
        --image=docker:registry.gitlab.com/gov-doe-arm/docker/lasso-o_shcu \
        -- /apps/base/python3.6/bin/python /bin/run_lasso.py
}
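# If your site allows users to pull images directly, the image can typically be
# fetched into the Shifter image cache ahead of time with shifterimg (some image
# gateways may require an explicit tag such as :latest), for example:
#   shifterimg pull docker:registry.gitlab.com/gov-doe-arm/docker/lasso-o_shcu
#   shifterimg images | grep lasso-o_shcu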
argument="$1"
...
...