diff --git a/docs/README.md b/docs/README.md index f2a0d10..e5727ec 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1,3 +1,3 @@ -# For future use +# Deprecated -May contain html version of the content at some point +Don't edit this repository. Work continued in https://github.com/csc-training/csc-env-eff diff --git a/hands-on/allas/allas-exercises.md b/hands-on/allas/allas-exercises.md index 428b4bb..c8ec767 100644 --- a/hands-on/allas/allas-exercises.md +++ b/hands-on/allas/allas-exercises.md @@ -1,4 +1,23 @@ -# Using Allas in CSC HPC environmnest +# Using Allas in CSC HPC environments + +Before the actual exercise, open the view to the Allas service in your browser using the cPouta WWW-interface. + +Open: https://pouta.csc.fi + +And login with your account. + +From the upper left corner, you find a project selection pop-up menu. If you have several projects available, select the +training project: **project_2002389** + +Then from the menu in left side of the interface, select: + +**Object Store -> Containers** + +During the exercises, you can use this interface to get another view to the buckets and objects in Allas. +Note that you need to **reload** the view in order to see the changes. + +======= + ## A. Log in Puhti and use scratch @@ -6,36 +25,31 @@ **Linux/mac** ```text -ssh XXXX@puhti.csc.fi (replace XXXX with your user account) +ssh XXXX@puhti.csc.fi (replace XXXX with your csc user account) ``` **Windows/PuTTY** **host:* puhti.csc.fi - **login as:** XXXX (replace XXXX with your account number) + **login as:** XXXX (replace XXXX with your csc user account) -In Puhti check you environment with command: +In Puhti check your environment with command: ```text csc-workspaces ``` Switch to the scratch directory of your project ```text -cd /scratch/project_2002389 +cd /scratch/project_2002389 # note! replace the text here (and below) with your project ``` -And create your own sub-directory, named after you training account: +And create your own sub-directory, named after your training account (if this directory does not yet exist): ```text mkdir XXXX ``` -(relace XXXX with your user account) +(replace XXXX with your user account) -Make the directory permissions such, that other group members can only read the contents but -not modify it -```text -chmod g-wx XXXX -``` -move to the new directory. +move to the directory. ```text cd XXXX ``` @@ -51,16 +65,18 @@ ls -ltr tree pythium ``` -## Using Allas +# Using Allas Open connection to Allas: ```text module load allas allas-conf ``` +If you have several Allas projects available, select the training project we are currently using. + ### Upload case 1. rclone -Upload the data from Puhti to Allas with rclone. Use the command below (replace XXXX with your user account): +Upload the data from Puhti to Allas with `rclone`. Use the command below (replace XXXX with your user account): ```text rclone -P copyto pythium allas:xxxx-genomes-rc/ ``` @@ -78,12 +94,12 @@ rclone lsf allas:xxxx-genomes-rc/ Check how this looks like in the Pouta web interface. Open browser and go to: [https://pouta.csc.fi/](https://pouta.csc.fi/) -In Pouta interface, go to _object store_ section, list the buckets (that are here called as “Containers”). +In Pouta interface, go to _object store_ section, list the buckets (which are here called as “Containers”). Locate your own _xxxx-genomes-rc_ directory and download one of the uploaded fasta files to your local computer. ### Upload case 2. 
a-put -Upload the pyhium directory from to Allas using following commands +Upload the pythium directory from Puhti to Allas using following commands (replace XXXX with your user account) A-put case 1: Store everything in one object: @@ -139,7 +155,7 @@ Try opening the public link that a-flip produced, with your browser. ## Upload case 3. Allas-backup Run commands: ```test -allas-backup –help +allas-backup -help allas-backup pythium allas-backup list ``` @@ -152,7 +168,8 @@ The data in pythium directory is now stored in many ways to Allas so we can remo rm -r pythium exit ``` -# C. Downloading data from Allas to Puhti +# Downloading data from Allas to Puhti + 1. Login to puhti.csc.fi and move to scratch: @@ -174,7 +191,7 @@ csc-workspaces ``` Go to your personal scratch directory of your project. ```text -cd /scratch/project_yourprojectnumber/trng_xxxx +cd /scratch/project_yourprojectnumber/xxxx ``` Set up Allas connection ```text @@ -215,9 +232,9 @@ mkdir vexans rclone copyto allas:xxxx-genomes-rc/pythium_vexans vexans/ ls -l vexans ``` -example 3: copy just one object +### example 3: copy just one object ```text -rclone copyto allas:trng_xxxx-genomes-rc/pythium_vexans/pythium_vexans.fasta \ ./vexans.fasta +rclone copyto allas:trng_xxxx-genomes-rc/pythium_vexans/pythium_vexans.fasta ./vexans.fasta ls -l ``` diff --git a/hands-on/allas/allas-mini-tutorial.md b/hands-on/allas/allas-mini-tutorial.md new file mode 100644 index 0000000..45e211a --- /dev/null +++ b/hands-on/allas/allas-mini-tutorial.md @@ -0,0 +1,108 @@ +# Allas Mini tutorial + +Open the view to the Allas service in your browser using the cPouta WWW-interface. + +Open: https://pouta.csc.fi + +And login with your account. + +From the upper left corner, you find a project selection pop-up menu. If you have several projects available, select the +training project: **project_2002389** + +Then from the menu in left side of the interface, select: + +**Object Store -> Containers** + +And create new container by pressin button: **+Container** + +Keep the container _Not public_ and name it as 2002389_xxxx ( replace xxxx with your user account). + +Open the new bucket (that is here calles as container) and upload one file from your computer. +Any file should do, but prefer a file that you can open in Puhti. + +During the exercises, you can use this interface to get another view to the buckets and objects in Allas. +Note that you need to **reload** the view in order to see the changes. + + +## Log in Puhti and use scratch + +1. Login to puhti.csc.fi and move to scratch: + +**Linux/mac** +```text +ssh XXXX@puhti.csc.fi (replace XXXX with your csc user account) +``` + +**Windows/PuTTY** + + **host:* puhti.csc.fi + + **login as:** XXXX (replace XXXX with your csc user account) + + +In Puhti check your environment with command: +```text +csc-workspaces +``` +Switch to the scratch directory of your project +```text +cd /scratch/project_2002389 # note! replace the text here (and below) with your project +``` +And create your own sub-directory, named after your training account (if this directory does not yet exist): +```text +mkdir XXXX +``` +(replace XXXX with your user account) + +move to the directory. +```text +cd XXXX +``` + +## Using Allas + +Open connection to Allas. +```text +module load allas +allas-conf +``` +If you have several Allas projects available, select the training project we are currently using. 
+ +Study what you have in allas with commands +```text +a-list +rclone lsd allas: + +a-list 2002389_xxxx +rclone ls allas:2002389_xxxx +``` + +Download the file you just uploaded to Allas from your local computer. +You can do that in two ways (replace your-file-name with the name of the file you uploaded): +```text +a-get 2002389_xxxx/your-file-name +``` +or +``` +rclone copy allas:2002389_xxxx/your-file-name ./ +``` + +Upload the file back to Allas. + +Try commands: + +```text +a-put your-file-name +a-put --nc -b 2002389_xxxx +``` +Use use _a-put -h_ to fugure out the difference between the two commands above. + +Then do the upload with rclone: +```text +rclone copy your-file-name allas:2002389_xxxx/ +``` +Locate the files you just uploaded in Pouta www-interface. + + + + diff --git a/hands-on/batch_jobs/README.md b/hands-on/batch_jobs/README.md index fb2b2ab..2896b1e 100644 --- a/hands-on/batch_jobs/README.md +++ b/hands-on/batch_jobs/README.md @@ -1,9 +1,13 @@ # Batch jobs ## Tutorials -* [Batch job tutorial](batch_jobs_tutorial.md) on serial and parallel jobs. -* [Hands-on batch jobs in Puhti tutorial](https://docs.csc.fi/support/tutorials/cmdline-handson/) +* [Serial batch job tutorial](serial.md) start with this. +* [Parallel batch job tutorial](parallel.md) using MPI and/or OpenMP. +* [Interactive batch job tutorial](interactive.md) on Puhti. +* [Hands-on batch jobs in Puhti tutorial](https://docs.csc.fi/support/tutorials/cmdline-handson/) A longer set of jobs to run. ## Exercises +* [Retrieving bio data from repository](exercise_retrieving-bio-data.md) as an interactive job. (BIO) +* [Serial, array and parallel jobs with R + contours calculation from DEM with raster package (GIS) ](https://github.com/csc-training/geocomputing/tree/master/R/puhti) +* [Serial, array and parallel jobs with Python + NDVI calculation rasterio package (GIS) ](https://github.com/csc-training/geocomputing/tree/master/python/puhti) * ... - diff --git a/hands-on/batch_jobs/interactive.md b/hands-on/batch_jobs/interactive.md new file mode 100644 index 0000000..82196bc --- /dev/null +++ b/hands-on/batch_jobs/interactive.md @@ -0,0 +1,31 @@ +# Batch job tutorial - Interactive jobs + +- In this tutorial we'll get familiar with the basic usage of the Slurm batch queue system at CSC +- The goal is to learn how to request resources that **match** the needs of a job +- A job consists of two parts: resource requests and the job step(s) +- Examples are done on Puhti + +## Interactive jobs + +- In an interactive batch job, an interactive shell session is launced on a computing node. For heavy interactive tasks one can request specific resources (time, memory, cores, disk). + +- You can also use tools with graphical user interfaces in an interactive shell session. For such usage the [NoMachine](https://docs.csc.fi/support/tutorials/nomachine-usage/) remote desktop often provides an improved experience. + +### A simple interactive job + +- To start an interactive job using one core for ten minutes +```text +sinteractive --account myprojectname --time 00:10:00 +``` +- You should see that the command prompt (what's shown to the left to your cursor) has changed from _puhti-login1_ (or _puhti-login2_) to e.g. _r07c51_ +- Once on the compute node, you can run commands directly from the command line without `srun`, e.g. 
launch the (default) Python interpreter: +``` +module load python-env +python3 +``` +- Quit the Python interpreter with `quit()` +- This way you can work interactively for extended period, using lots of memory without creating load on the login nodes, which is forbidden in [the Usage Policy](https://docs.csc.fi/computing/overview/#usage-policy). +- Quit the interactive batch job with `exit`. Note, that above you asked only for 10 minutes of time. Once that is up, you will be automatically logged out from the compute node. From the command line prompt you can see whether you're in the compute node (e.g. _r07c51_) or back to the login node (e.g. _puhti-login2_). Giving `exit` in the login node, will log you out from Puhti. +- See the documetation at docs.csc.fi of [Interactive usage](https://docs.csc.fi/computing/running/interactive-usage/), for further information + +## Additional material [FAQ on CSC batch jobs ](https://docs.csc.fi/support/faq/#batch-jobs) in Docs CSC diff --git a/hands-on/batch_jobs/batch_jobs_tutorial.md b/hands-on/batch_jobs/parallel.md similarity index 55% rename from hands-on/batch_jobs/batch_jobs_tutorial.md rename to hands-on/batch_jobs/parallel.md index 645839b..8a3addb 100644 --- a/hands-on/batch_jobs/batch_jobs_tutorial.md +++ b/hands-on/batch_jobs/parallel.md @@ -1,39 +1,10 @@ -# Batch job tutorial +# Batch job tutorial - Parallel jobs - In this tutorial we'll get familiar with the basic usage of the Slurm batch queue system at CSC - The goal is to learn how to request resources that **match** the needs of a job - A job consists of two parts: resource requests and the job step(s) -- We'll go through examples for serial, parallel and interactive jobs - Examples are done on Puhti -## Serial jobs - -- For a program that can use only one core (cpu), one should request only one core from Slurm. -- The job doesn't benefit from additional cores, hence don't request more -- Excess reservation is waisted since it wouldn't be available to other users -- Within the job, the actual program is launched using the command `srun` - -```text -#!/bin/bash -#SBATCH --account=myprojectname -#SBATCH --partition=test -#SBATCH --ntasks=1 -#SBATCH --time=00:02:00 - -srun hostname -srun sleep 60 -``` -- In the batch job example above we are requesting one core (`--ntasks=1`) for two minutes (`--time=00:02:00`) from the test queue (`--partition=test`) -- We want to run the program `hostname`, that will print the name of the Puhti computing node that has been allocated for this particular job. 
-- In addition, we are running the `sleep` program to keep the job running for an additional 60 seconds, in order to have time to monitor the job -- Copy the example above into a file called `my_serial.bash` and change the `myprojectname` to the project you actually want to use -- Submit the job to the queue with the command `sbatch my_serial.bash` -- If you are quick enough you should see your job in the queue by issuing the command `squeue -u $USER` -- By default the output is written into a file named `slurm-XXXXXXX.out` where `XXXXXXX` is a unique number corresponding to the job ID of the job -- Check the efficiency of the job compared to the reserved resources by issuing the command `seff XXXXXXX` (replace `XXXXXXX` with the actual job ID number from the `slurm-XXXXXXX.out` file) -- You can get a list of all your jobs that are running or queuing with the command `squeue -u $USER` -- A submitted job can be cancelled using the command `scancel XXXXXXX` - ## Parallel jobs - A parallel program is capable of utilizing several cores and other resources simultaneously for the same job - The aim of a parallel program is to solve a problem (job) faster and to be able to tacle a larger problem that wouldn't fit into a single core @@ -44,8 +15,15 @@ srun sleep 60 ### A simple OpenMP job - An OpenMP enabled program can take advantage of multiple cores that share the same memory on a **single node** -- Dowload a simple OpenMP parallel program with the command `wget https://a3s.fi/hello_omp.x/hello_omp.x` -- Make it executable using the command `chmod +x hello_omp.x` +- Dowload a simple OpenMP parallel program with the +``` +wget https://a3s.fi/hello_omp.x/hello_omp.x +``` +- Make it executable using the command: +``` +chmod +x hello_omp.x +``` +- Copy the following example into a file called `my_parallel_omp.bash` and change the `myprojectname` to the project you actually want to use ```text #!/bin/bash @@ -61,11 +39,16 @@ srun hello_omp.x - We want to run the program `hello_omp.x`, that will be able to utilise four cores - The variable `OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK` tells the program that it can use four cores - Each of the four threads launced by `hello_omp.x` will print their own output -- Copy the example above into a file called `my_parallel_omp.bash` and change the `myprojectname` to the project you actually want to use -- Submit the job to the queue with the command `sbatch my_parallel_omp.bash` +- Submit the job to the queue with the command +``` +sbatch my_parallel_omp.bash +``` - When finished, the output file `slurm-XXXXXXX.out` should contain the results printed from the four OpenMP threads -- Check it with the `cat slurm-XXXXXXX.out` command: - +- Check it with +``` +cat slurm-XXXXXXX.out +``` +- The results should look like: ```text cat slurm-5118404.out Hello from thread: 0 @@ -78,7 +61,12 @@ Hello from thread: 1 - A MPI enabled program can take advantage of resourses that are spread over multiple nodes - Dowload a simple MPI parallel program with the command `wget https://a3s.fi/hello_mpi.x/hello_mpi.x` -- Make it executable using the command `chmod +x hello_mpi.x` +- Make it executable using the command +``` +chmod +x hello_mpi.x +``` + +- Copy the example below into a file called `my_parallel.bash` and change the `myprojectname` to the project you actually want to use ```text #!/bin/bash @@ -90,14 +78,18 @@ Hello from thread: 1 srun hello_mpi.x ``` - - In the batch job example above we are requesting resources from two nodes (`--nodes=2`), and four cores from each node 
(`--ntasks-per-node=4`) for ten seconds (`--time=00:00:10`) from the test queue (`--partition=test`) - We want to run the program `hello_mpi.x`, that will, based on the resource request, start 8 simultaneous tasks - Each of the 8 tasks launced by `hello_mpi.x` will report on which node they got their resource -- Copy the example above into a file called `my_parallel.bash` and change the `myprojectname` to the project you actually want to use -- Submit the job to the queue with the command `sbatch my_parallel.bash` +- Submit the job to the queue with the command +``` +sbatch my_parallel.bash +``` - When finished, the output file `slurm-XXXXXXX.out` should contain the results obtained by the `hello_mpi.x` program on how the 8 tasks were distributed over the two reserved nodes -- Check it with the `cat slurm-XXXXXXX.out` command: +- Check it with +``` +cat slurm-XXXXXXX.out +``` - **Note!** This example asks 4 cores from each of the 2 nodes. Normally, this would not make sense, but it would be better to run all 8 cores in the same node (in Puhti one node has 40 cores). Typically, you want your resources (cores) to be spread on as few nodes as possible. ```text cat slurm-5099873.out @@ -115,23 +107,4 @@ Hello world from node r07c02.bullx, rank 6 out of 8 tasks - You can get a list of all your jobs that are running or queuing with the command `squeue -u $USER` - A submitted job can be cancelled using the command `scancel XXXXXXX` -## Interactive jobs -- In an interactive batch job, an interactive shell session is launced on a computing node. For heavy interactive tasks one can request specific resources (time, memory, cores, disk). - -- You can also use tools with graphical user interfaces in an interactive shell session. For such usage the [NoMachine](https://docs.csc.fi/support/tutorials/nomachine-usage/) remote desktop often provides an improved experience. - -### A simple interactive job - -- To start an interactive job using one core for ten minutes -```text -sinteractive --account myprojectname --time 00:10:00 -``` -- See the documetation at docs.csc.fi of [Interactive usage](https://docs.csc.fi/computing/running/interactive-usage/), for further information - - -## Gathering information -- The `sinfo` command gives an overview of the partitions(queues) offered by the computer -- The `squeue` command shows the list of jobs which are currently queued (they are in the RUNNING state, noted as ‘R’) or waiting for resources (noted as ‘PD’, short for PENDING) -- the command `squeue -u $USER` lists your jobs - ## Additional material [FAQ on CSC batch jobs ](https://docs.csc.fi/support/faq/#batch-jobs) in Docs CSC diff --git a/hands-on/batch_jobs/serial.md b/hands-on/batch_jobs/serial.md new file mode 100644 index 0000000..d3d9965 --- /dev/null +++ b/hands-on/batch_jobs/serial.md @@ -0,0 +1,42 @@ +# Batch job tutorial - Serial jobs + +- In this tutorial we'll get familiar with the basic usage of the Slurm batch queue system at CSC +- The goal is to learn how to request resources that **match** the needs of a job +- A job consists of two parts: resource requests and the job step(s) +- Examples are done on Puhti + +## Serial jobs + +- For a program that can use only one core (cpu), one should request only one core from Slurm. 
+- The job doesn't benefit from additional cores, hence don't request more +- Excess reservation is wasted since it wouldn't be available to other users +- Within the job (or allocation), the actual program is launched using the command `srun` +- If you use a software that is preinstalled at CSC, please [check its infopage](https://docs.csc.fi/apps/): it might have a batch job template with useful default settings + +```text +#!/bin/bash +#SBATCH --account=myprojectname +#SBATCH --partition=test +#SBATCH --ntasks=1 +#SBATCH --time=00:02:00 + +srun hostname +srun sleep 60 +``` + +- In the batch job example above we are requesting one core (`--ntasks=1`) for two minutes (`--time=00:02:00`) from the test queue (`--partition=test`) +- We want to run the program `hostname`, that will print the name of the Puhti computing node that has been allocated for this particular job. +- In addition, we are running the `sleep` program to keep the job running for an additional 60 seconds, in order to have time to monitor the job +- Copy the example above into a file called `my_serial.bash` and change the `myprojectname` to the project you actually want to use (e.g. with `nano`) +- Submit the job to the queue with the command: +``` +sbatch my_serial.bash +``` +- If you are quick enough you should see your job in the queue by issuing the command `squeue -u $USER` +- By default the output is written into a file named `slurm-XXXXXXX.out` where `XXXXXXX` is a unique number corresponding to the job ID of the job +- Check the efficiency of the job compared to the reserved resources by issuing the command `seff XXXXXXX` (replace `XXXXXXX` with the actual job ID number from the `slurm-XXXXXXX.out` file) +- You can get a list of all your jobs that are running or queuing with the command `squeue -u $USER` +- A submitted job can be cancelled using the command `scancel XXXXXXX` + + +## Additional material [FAQ on CSC batch jobs ](https://docs.csc.fi/support/faq/#batch-jobs) in Docs CSC diff --git a/hands-on/connecting/README.md b/hands-on/connecting/README.md index 92b5e0a..9209157 100644 --- a/hands-on/connecting/README.md +++ b/hands-on/connecting/README.md @@ -2,8 +2,8 @@ ## Tutorials * [Login Puhti with ssh](ssh-puhti.md) -* [Login Puhti with NoMachine and run gnuplot](nomachine.md) +* [Login Puhti with NoMachine and run gnuplot](https://docs.csc.fi/support/tutorials/nomachine-usage/) ## Exercises -* [Search for a crystal structure with 6 fused benzene rings](mercury.md) -* more exercises... +* ... + diff --git a/hands-on/connecting/ssh-puhti.md b/hands-on/connecting/ssh-puhti.md index e7c667b..0f29e8b 100644 --- a/hands-on/connecting/ssh-puhti.md +++ b/hands-on/connecting/ssh-puhti.md @@ -12,23 +12,24 @@ On Windows 10, you can use the *Windows Power Shell* or [download Putty](https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html), or [download and install MobaXterm](https://mobaxterm.mobatek.net/download.html). -In this tutorial, we assume you use Windows Power Shell. More examples can be found +In this tutorial, we assume you use MobaXterm. [More examples can be found in docs](https://docs.csc.fi/computing/connecting/). -- Select Windows Power Shell from the applications list (opens from the windows logo) or search for it + +1. Launch MobaXterm from the applications list (opens from the windows logo) or search for it in the bottom bar search box. +2. "SSH" icon at top left +3. in the Basic SSH settings section Remote host field write "puhti.csc.fi" +4. 
Tick the "specify username" box and in the box write your csc username (leave port in the default setting 22). +5. Click "OK" at the bottom. +6. MobaXterm will now log you in puhti.csc.fi and ask you for your password. -- In the windows-blue terminal window type: -```bash -ssh @puhti.csc.fi -``` -- replace above `` with your actual csc username, or training account - which ever you're using, and press enter +* The next time you want to login to Puhti, just select it from the "session" menu on the left. ## MacOS In MacOS, you can use Terminal similarly as with Linux machines (see below). Simply open the Terminal application and type: ```bash -ssh @puhti.csc.fi +ssh yourcscusername@puhti.csc.fi ``` ## Linux @@ -36,7 +37,7 @@ ssh @puhti.csc.fi Laptops and workstations running Linux typically have SSH installed. Simply open a terminal and give: ```bash -ssh @puhti.csc.fi +ssh yourcscusername@puhti.csc.fi ``` - if you're connecting to Puhti (or that Puhti login node) for the first time, SSH will @@ -70,9 +71,9 @@ Last login: Mon Dec 14 14:53:15 2020 from jabadabaduu.fi ... └─────────────────────────────────────────────────────────────────────────────┘ -[@puhti-login1 ~]$ +[ yourcscusername@puhti-login1 ~]$ ``` Now, you're ready to go. Note, however, that remote graphics will not work. You could -add X11-tunneling to your ssh-connection, by adding `-X` or `-Y` to your command, and -[in Windows a separate X11-emulator](), but [for intesive remote graphics we recommend -using NoMachine](). +add X11-tunneling to your ssh-connection, by adding `-X` or `-Y` to your command, while +in Windows MobaXterm actually will tunnel the connection by default. However, for +intensive remote graphics we recommend using NoMachine. diff --git a/hands-on/linux_prerequisites/README.md b/hands-on/linux_prerequisites/README.md new file mode 100644 index 0000000..a882cfb --- /dev/null +++ b/hands-on/linux_prerequisites/README.md @@ -0,0 +1,10 @@ +# Linux prerequisites + +## Tutorials +* [Login Puhti with ssh](csc-env-eff/hands-on/connecting/ssh-puhti.md) +* [Basic Linux commands](basic-linux-commands.md) +* [Basics of file editors](basic-file-editing.md) + + +## Exercises +* TBA \ No newline at end of file diff --git a/hands-on/linux_prerequisites/basic-file-editing.md b/hands-on/linux_prerequisites/basic-file-editing.md new file mode 100644 index 0000000..8a3beaf --- /dev/null +++ b/hands-on/linux_prerequisites/basic-file-editing.md @@ -0,0 +1,31 @@ +# Basic file editing + +This tutorial requires that you have a [user account at CSC](https://docs.csc.fi/accounts/how-to-create-new-user-account/) +and it is a member of a project that [has access to Puhti service](https://docs.csc.fi/accounts/how-to-add-service-access-for-project/). +You have also already [logged to Puhti with ssh](ssh-puhti.md), and [learned the basic Linux commands](basic-linux-commands.md). + +We downloaded a file called my-first-file.sh, made a copy of it (yourname-first-file.sh), and now we practise how to edit it! + +These exercises are done with `nano` editor, but you can use your favorite editor too. +Here's the [nano cheet sheat](https://www.nano-editor.org/dist/latest/cheatsheet.html) + +1. Open the file with `nano`: +```bash +nano yourname-first-file.sh +``` + +2. Let's edit the file. Type something there! + +3. Let's save the file (Ctrl + S) and exit `nano` (Ctrl + X). Type Y to confirm saving. + +4. Check that the modifications are actually there: +```bash +less yourname-first-file.sh +``` +(Exit from preview with q.) + +5. 
You can also create files with `nano`. Try simply: +```bash +nano yourname-cool-note-file.sh +``` +Type something there as well, save, and close. diff --git a/hands-on/linux_prerequisites/basic-linux-commands.md b/hands-on/linux_prerequisites/basic-linux-commands.md new file mode 100644 index 0000000..251fabf --- /dev/null +++ b/hands-on/linux_prerequisites/basic-linux-commands.md @@ -0,0 +1,54 @@ +# Basic Linux commands + +This tutorial requires that you have a [user account at CSC](https://docs.csc.fi/accounts/how-to-create-new-user-account/) +and it is a member of a project that [has access to Puhti service](https://docs.csc.fi/accounts/how-to-add-service-access-for-project/). +You have also already [logged to Puhti with ssh](ssh-puhti.md). + + +1. Now that you have logged in Puhti, let's check in which folder we are in! Type pwd and hit Enter. +```bash +pwd +``` + +2. Are there any files there? +```bash +ls +``` + +3. Let's make a directory (replace YourName with your name, for example: MariasTestFolder)! Try `ls` to see if the folder is now there! +```bash +mkdir YourNameTestFolder +ls +``` + +4. Go to that folder. Note, that if you just type `cd` and the first letter of the folder name, then hit 'tab' key, the terminal completes the name. Handy! +```bash +cd YourNameTestFolder +``` + +5. Let's download a file into this new folder. `wget` is the command for downloading from URL. +```bash +wget https://github.com/CSCfi/csc-env-eff/raw/master/hands-on/linux_prerequisites/my-first-file.sh +``` + +6. What kind of file did you get? What's in that file now? What size is it? Let's use `ls` command with some extra parameters, and `less` command to check out how the file looks like. +```bash +ls -lth +less my-first-file.sh +``` +To exit the `less` preview of the file, hit 'q'. + +7. Let's make a copy of this file (again, replace YourName with your name). +```bash +cp my-first-file.sh YourName-first-file.sh +ls -lth +less YourName-first-file.sh +``` + +8. Let's remove the file we originally downloaded (leave your own copy). +```bash +rm my-first-file.sh +ls +``` + +Next, let's learn [how to edit that file](basic-file-editing.md)! diff --git a/hands-on/linux_prerequisites/my-first-file.sh b/hands-on/linux_prerequisites/my-first-file.sh new file mode 100644 index 0000000..84633c7 --- /dev/null +++ b/hands-on/linux_prerequisites/my-first-file.sh @@ -0,0 +1,4 @@ +# Hi there! Good job! +# This is a test text file. You can play around with this file. + +# You can try typing here! diff --git a/hands-on/modules/README.md b/hands-on/modules/README.md index bc78f61..fdbd8e9 100644 --- a/hands-on/modules/README.md +++ b/hands-on/modules/README.md @@ -2,8 +2,7 @@ ## Tutorials * [Using modules in Puhti](modules-puhti.md) -* add tutorials here... ## Exercises * [Biosoftwares in Puhti](module-exercise-with-aligners.md) -* more exercises... \ No newline at end of file +* More exercises TBA diff --git a/hands-on/modules/module-exercise-with-aligners.md b/hands-on/modules/module-exercise-with-aligners.md index 52ca8d3..b436e95 100644 --- a/hands-on/modules/module-exercise-with-aligners.md +++ b/hands-on/modules/module-exercise-with-aligners.md @@ -32,7 +32,7 @@ Was HISAT2 available in the biokit? ## Bioconda environment -4. After aligning, we might want to check the quality of the alignment with RSeQC tool. As we can see from the `module list` command above, it was not included in the biokit. Like we learned, you can try to look for it from the application manual page and by using the `module spider reseqc`. +4. 
After aligning, we might want to check the quality of the alignment with RSeQC tool. As we can see from the `module list` command above, it was not included in the biokit. Like we learned, you can try to look for it from the application manual page and by using the `module spider rseqc`. No luck? What next? Let's take a look at the bioconda environment. @@ -43,13 +43,14 @@ Let's check what is available with spider again, and load one of the modules: module spider bioconda module load bioconda/3 ``` -Take a look at the message you get. Note, that some dependency modules were re-loaded in the background. It says that we first need to set the PROJAPPL environment variable. -To do so, run command (you can check the name/number of your project(s) with command `csc-workspaces`) +Take a look at the message you get. Note, that some dependency modules were re-loaded in the background. It also says that we first need to set the PROJAPPL environment variable. +To do so, run command (you can check the name/number of your project(s) with command `csc-workspaces`): ```bash export PROJAPPL=/projappl/project_XXXXXXX ``` -Check which applications are available in this bioconda environment: +Re-run the ```module load``` command and then check which applications are available in this bioconda environment: ```bash +module load bioconda/3 conda env list ``` See RSeQc there? diff --git a/hands-on/modules/modules-puhti.md b/hands-on/modules/modules-puhti.md index f4c49e1..a5930f4 100644 --- a/hands-on/modules/modules-puhti.md +++ b/hands-on/modules/modules-puhti.md @@ -4,7 +4,7 @@ This tutorial requires that you have a [user account at CSC](https://docs.csc.fi and it is a member of a project that [has access to Puhti service](https://docs.csc.fi/accounts/how-to-add-service-access-for-project/). -1. Log in to Puhti with your user credentials. +1. Log in to Puhti with your user credentials. (Replace with your CSC username, withouth the < > brackets!) ```bash ssh @puhti.csc.fi ``` @@ -14,7 +14,7 @@ ssh @puhti.csc.fi module list ``` -3. Check what versions are available for Gromacs (note, that this might take a while, as the command searches through all the available modules): +3. Check what versions are available for Gromacs. (Note, that this might take a while, as the command searches through all the available modules. The list can be long, you can go to next line with Enter, or stop viewing by typing ```q```). ```bash module spider gromacs ``` @@ -24,14 +24,13 @@ module spider gromacs module avail gromacs-env ``` -5. Load the gromacs-env module, and check the loaded modules list again. +5. Load the gromacs-env module, and check the loaded modules list again. Do you notice any changes? ```bash module load gromacs-env module list ``` -Do you notice any changes? -6. Switch to the GPU version of Gromacs, and check the situation. +6. Switch to the GPU version of Gromacs, and check the situation. Do you notice any changes? ```bash module load gromacs-env/2020-gpu module list diff --git a/hands-on/singularity/singularity-tutorial.md b/hands-on/singularity/singularity-tutorial.md index 80908f2..16b0309 100644 --- a/hands-on/singularity/singularity-tutorial.md +++ b/hands-on/singularity/singularity-tutorial.md @@ -40,9 +40,31 @@ The first command is run in the host. The second command is run inside the conta The tutorial container is based on Ubuntu 18.04. The host and the container use the same kernel, but the rest of the system can vary. 
That means a container can be based
-on a different Linux distro than the host (as long as they the are kernel compatible),
+on a different Linux distribution than the host (as long as they are kernel compatible),
 but can't run a totally different OS like Windows.
 
+`singularity exec` is the run method you will typically use in a batch job script.
+
+Make a file called `test.sh`, and copy the following contents to it. Change "project_xxxx"
+to the correct project name.
+```text
+#!/bin/bash
+#SBATCH --job-name=test
+#SBATCH --account=project_xxxx
+#SBATCH --partition=test
+#SBATCH --time=00:01:00
+#SBATCH --mem=1G
+#SBATCH --ntasks=1
+#SBATCH --cpus-per-task=1
+
+singularity exec tutorial.sif hello_world
+```
+Submit the job to the queue with:
+```text
+sbatch test.sh
+```
+For more information on batch jobs, please see [CSC Docs pages](https://docs.csc.fi/computing/running/getting-started/).
+
 ### 2. Singularity run
 
 When containers are created, a standard action, called the `runscript`, is defined.
 Depending on the container, it may simply print out a message, or it may launch a program
@@ -90,7 +112,7 @@ container.
 This is done with the command line argument `--bind` (or `-B`).
 
 The basic syntax is `--bind /path/in/host:/path/inside/container`.
-```
+
 The bind path does not need to exist inside the container. It is created if
 necessary. More than one bind pair can be specified. The option is available for
 all the run methods described above.
@@ -100,15 +122,15 @@ from inside the container without bind:
 ```text
 export SCRATCH=/scratch/project_12345
-singularity exec tutorial.sif ls
+singularity exec tutorial.sif ls $SCRATCH
 ```
 This will not work. The container cannot see the host directory, so you will get a
-`No such file or directory` error.
+"No such file or directory" error.
 
 Now try binding host directory `/scratch` to directory `/scratch` inside the container.
 ```text
-singularity exec --bind /scrath:/scratch tutorial.sif ls $SCRATCH
+singularity exec --bind /scratch:/scratch tutorial.sif ls $SCRATCH
 ```
 This time the host directory is linked to the container directory and the command works.
@@ -131,7 +153,7 @@ image file.
 export SING_IMAGE=$PWD/tutorial.sif
 singularity_wrapper exec ls $SCRATCH
 ```
-Since some modules set `$SING_IMAGE` when loaded it is a good idea to start with
+Since some modules set `$SING_IMAGE` when loaded, it is a good idea to start with
 `module purge` if you plan to use it, to make sure the correct image is used.
 
 ## Environment variables
 
 Some software may require some environment variables to be set, e.g. to point
 to reference data or a configuration file.
 
 Most environment variables set on the host are inherited by the container. Sometimes
-this may be undesirable. With command line option --cleanenv host environment is not
-inherited by the conatiner.
+this may be undesirable. With the command line option `--cleanenv`, the host environment is not
+inherited by the container.
 
 To set an environment variable specifically inside the container, you can set an
 environment variable `$SINGULARITYENV_xxx` (where xxx is the variable name) on the host
 before invoking the container.
Set some test variables: ```text export TEST1="value1" -export SINGULARITYENV_TEST2="value" +export SINGULARITYENV_TEST2="value2" ``` Compare the outputs of: @@ -158,16 +180,22 @@ env |grep TEST singularity exec tutorial.sif env |grep TEST singularity exec --cleanenv tutorial.sif env |grep TEST ``` -The first command is run on host and we see `$TEST1` and `$SINGULARITYENV_TEST2`. The +The first command is run on host, and we see `$TEST1` and `$SINGULARITYENV_TEST2`. The second command is run in the container and we see `$TEST1` (inherited from host) and -`$TEST2` (specifically set inside the container by setting `$SINGULARITYENV_TEST2`on host). +`$TEST2` (specifically set inside the container by setting `$SINGULARITYENV_TEST2` on host). The third command is also run inside the container, but this time we omitted host environment variables, so we only see `$TEST2`. +It should be noted that any variables on command line are substituted by their values on the host. +```text +singularity exec tutorial.sif echo $TEST2 +``` +This will result in empty output because $TEST2 has not been set on host. + ## Exploring containers Our test container includes program `hello2`, but it is not in the `$PATH`. One way to -find it is to try running `find`inside the container +find it is to try running `find` inside the container ```text singularity exec tutorial.sif find / -type f -name "hello2" 2>/dev/null @@ -190,27 +218,45 @@ Instead, you will have to import a ready image file. There are various option to do this. -### Pull an existing Singularity container from a repository -Use `singularity pull`: +### 1. Run or pull an existing Singularity container from a repository +It is possible to run containers directly from repository: ```text -singularity pull shub://vsoch/hello-world +singularity run shub://vsoch/hello-world:latest ``` +This can, however, lead to a batch job failing if there are network problems. +Usually it is preferable to pull the container first and use the image file. +```text +singularity pull shub://vsoch/hello-world:latest +singularity run hello-world_latest.sif +``` + +### 2. Convert an existing Docker container to Singularity -### Convert an existing Docker container to Singularity -Use `singularity build`: +Docker images are downloaded as layers. These layers are stored in the cache directory. +Default location for this is `$HOME/.singularity/cache`. Since the home directory has +limited capacity, and some images can be large, it's best to set `$SINGULARITY_CACHE` +to point to some other location with more space. + +If running with `sinteractive`, or as batch job on an IO node, you can use the +fast local storage: ```text -singularity build pytorch_20.03-py3.sif docker://nvcr.io/nvidia/pytorch:20.03-py3 +export SINGULARITY_TMPDIR=$LOCAL_SCRATCH +export SINGULARITY_CACHEDIR=$LOCAL_SCRATCH ``` -Unlike Singularity images that are downloaded as as single file, Docker images are -downloaded as layers. These layers are stored in the cache directory. Default location -for this is `$HOME/.singularity/.cache`. Since the home directory has limited capacity, -and some images can be large, it's best to set `$SINGULARITY_CACHE` to point to some -other location with more space. +If running on a node with no local storage, you can use e.g. /scratch. 
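+For example, a minimal sketch of pointing the cache to your project's scratch area (the
+`project_12345` path below is just a placeholder, use your own project directory):
+```text
+# create a cache directory under /scratch and point Singularity to it
+mkdir -p /scratch/project_12345/$USER/singularity_cache
+export SINGULARITY_TMPDIR=/scratch/project_12345/$USER/singularity_cache
+export SINGULARITY_CACHEDIR=/scratch/project_12345/$USER/singularity_cache
+```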
+You can avoid some unnecessary warnings by unsetting a variable: +```text +unset XDG_RUNTIME_DIR +``` +You can now run `singularity build`: +```text +singularity build alpine.sif docker://library/alpine:latest +``` You can find more detailed instructions for converting Docker images in Docs CSC: [Running existing containers](https://docs.csc.fi/computing/containers/run-existing/) -### Build the image on another system and transfer the image file to Puhti +### 3. Build the image on another system and transfer the image file to Puhti To do this you will need an access to system where you have root access and that has Singularity installed. diff --git a/hands-on/singularity/singularity_extra_creating-containers.md b/hands-on/singularity/singularity_extra_creating-containers.md index 2978599..edd0921 100644 --- a/hands-on/singularity/singularity_extra_creating-containers.md +++ b/hands-on/singularity/singularity_extra_creating-containers.md @@ -4,22 +4,22 @@ This is an extra exercise. It can not be run in Puhti. You will need access to a computer or a virtual machine where you have sudo rights and that has Singularity 3.x installed. -In this tutorial we create a Singularity container and install the same software +In this tutorial we create a Singularity container, and install the same software as we installed in the tutorial -[Installing a simple C code from source](..\installing\installing_hands-on_c.md) +[Installing a simple C code from source](..\installing\installing_hands-on_c.md). You can see that tutorial for more information on the installation commands. In this tutorial we only cover the basics. Detailed instructions can be found -in the [Singularity manual](https://sylabs.io/guides/3.7/user-guide/.) +in the [Singularity manual](https://sylabs.io/guides/3.7/user-guide). -# Sandbox mode +## Sandbox mode One way to create a Singularity container is to do it in so-called sandbox -mode. Instead of an read-only image file, we create a directory structure +mode. Instead of an image file, we create a directory structure representing the file system of the container. -To start with we create a basic container from a definition file. To choose +To start with, we create a basic container from a definition file. To choose a suitable Linux distribution, you should check the documentation of the software you wish to install. If the the developers provide installation intructions for a specific distribution, it is usually easiest to start with that. @@ -36,14 +36,14 @@ OSVersion: 7 MirrorURL: http://mirror.centos.org/centos-%{OSVERSION}/%{OSVERSION}/os/$basearch/ Include: yum ``` -We the use that definition file to build teh container: +We then use that definition file to build the container: ```text sudo singularity build --sandbox mcl centos.def ``` Note that instead of an image file, we created a directory called `mcl`. If you need to include some reference files etc, you can copy them to correct subfolder. -We can then open a shell in that container. We need the container file system +We can then open a shell in the container. We need the container file system to be writable, so we include option `--writable`: ```text sudo singularity shell --writable mcl @@ -52,15 +52,23 @@ The command prompt should now be `singularity>` If there is a need to make the container as small as possible, we should only install the dependencies we need. Usually the size is not that critical, so we may -opt more for ease of use. 
In this case we install application group "Development
-tools" that includes most of the components we need (C, C++, make), but also
-things we don't need in this case.
+opt more for ease of use.
+
+In this case we install the application group "Development Tools" that includes
+most of the components we need (C, C++, make), but also a lot of things we
+don't actually need in this case.
+
+Notice that, unlike on the CSC machines, we are able to use the package management
+tools (in this case `yum`). This will often make installing libraries and other
+dependencies easier.
+
+Also notice that it is not necessary to use `sudo` inside the container.
 
 ```text
 yum group install "Development Tools" -y
 yum install wget -y
 ```
-We are now ready to install.
+We are now ready to install the software.
 
 Download and extract the distribution package:
 ```text
@@ -92,7 +100,7 @@ rm -rf mcl-*
 ```
 We can also add a runscript:
 ```text
-echo '"$@"' >> /singularity
+echo 'exec /bin/bash "$@"' >> /singularity
 ```
 We can now exit the container:
 ```text
@@ -114,8 +122,9 @@ singularity exec mcl.sif mcl --version
 
 The above method is applicable as-is if you intend the container to be
 only used by you and your close collaborators. However,
 if you plan to distribute it wider, it's best to write
-a definition file for it. That way the other users can, if
-they so choose, rebuild the production image.
+a definition file for it. That way the other users can see
+what is in the container and they can, if they so choose, easily
+rebuild the production image.
 
 A definition file will also make it easier to modify and re-use
 the container later. For example, software update can often be done
@@ -171,7 +180,7 @@ Include: yum
 export LC_ALL=C
 
 %runscript
-    exec "$@"
+    exec /bin/bash "$@"
 ```
 
 In more complex cases it is often helpful to first build the image in
diff --git a/hands-on/singularity/singularity_extra_replicating-conda.md b/hands-on/singularity/singularity_extra_replicating-conda.md
index a630f93..f13fbb2 100644
--- a/hands-on/singularity/singularity_extra_replicating-conda.md
+++ b/hands-on/singularity/singularity_extra_replicating-conda.md
@@ -8,7 +8,7 @@ Singularity 3.x installed.
 
 Conda is a useful tool for installing software with complex dependencies. It has,
 however, some problems, especially on systems like Puhti.
 
-The main problems are related to storage: Conda environments can be quite large
+The main problems are related to storage: Conda environments can be quite large,
 and can have tens of thousands of files. Just 3-4 environments are enough to
 fill the basic quota of the project's /projappl directory.
 
@@ -16,7 +16,7 @@ base system, meaning that e.g. updates in Puhti can sometimes break Conda
 environments, necessitating a re-install.
 
-Using a Singularity container instead can help with bot problems: Singularity
+Using a Singularity container instead can help with both problems: Singularity
 containers are just a single file that is typically smaller than the total
 size of the Conda environment directory. They are also less sensitive to changes
 in the host system.
 
@@ -36,7 +36,7 @@ You can find more detailed instructions for converting Docker images in Docs CSC
 
 ## Replicating existing Conda environment
 
-If you have an existing Conda environment, you can save `environmet.yml`file and
+If you have an existing Conda environment, you can save an `environment.yml` file and
 use it to replicate the environment.
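For example, a minimal sketch of saving that file from an existing environment (here `myenv`
is just a placeholder name for your own environment):
```text
# write the packages of the environment "myenv" into environment.yml
conda env export -n myenv > environment.yml
```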
Please note that the `environment.yml` file will only reflect changes to environment @@ -76,7 +76,7 @@ From: continuumio/miniconda3 %runscript exec "$@" ``` -Make sure files `environment.yml`and `conda_environment.def` are in the +Make sure files `environment.yml` and `conda_environment.def` are in the current directory and give command: ```text @@ -120,4 +120,7 @@ In this case there would even better option: Building from a ready container: singularity build fastx.sif docker://biocontainers/fastx-toolkit:v0.0.14-6-deb_cv1 ``` - Good: This can be done with user right in Puhti and you end up with a single 61 MB file. -- Bad: Finding a ready, working container may take some time. \ No newline at end of file +- Bad: Finding a ready, working container may take some time. + +Containers are not a "silver bullet" solution to all installation problems, but they can +be in many cases a preferable alternative for conda. \ No newline at end of file diff --git a/slides/00_study_tips.md b/slides/00_study_tips.md index 018d785..6e180a5 100644 --- a/slides/00_study_tips.md +++ b/slides/00_study_tips.md @@ -2,13 +2,13 @@ theme: csc-2019 lang: en --- -# Study tips for self learning {.title} +# Study tips and problem solving {.title} # Using these materials - The material is organized by topics in increasing complexity - Feel free to jump if you know the basics already -- Read the slides / watch the video first +- Read the slides / watch the video (to appear) first - Complete the tutorial to make sure you've got the steps right - Try out one or more of the exercises to verify your new skill - If you get stuck, consult [the docs](https://docs.csc.fi) linked to the topic slides @@ -16,41 +16,74 @@ lang: en # General problem solving -1. Try in [docs.csc.fi](https://docs.csc.fi) in the right section in the *hierarchy* +1. Try looking in [docs.csc.fi](https://docs.csc.fi) in the right section in the *hierarchy* 2. Try in the [FAQ](https://docs.csc.fi/support/faq/) 3. Try the search in docs or google for it + - Start typing a keyword in docs, Copy/paste the error message in google 4. Send an email to [servicedesk@csc.fi](mailto:servicedesk@csc.fi) containing: - A descriptive title - What you wanted to achieve, and on which which computer - Which commands you had given - What error messages resulted - - [More tips to help us quickly solve your issue](https://docs.csc.fi/support/support-howto/)) + - [More tips to help us quickly solve your issue](https://docs.csc.fi/support/support-howto/) -# Learning a new method or application +# Running a new application in Puhti 1/2 - If it comes with tutorials, do at least one - This will likely be the fastest way forward -- Check if there's a page for it in [docs.csc.fi/apps/](https://docs.csc.fi/apps/) + - Naturally, read the manual / instructions +- Check if there's a page for it in [docs CSC](https://docs.csc.fi/apps/) - If there is, use the batch script example from _there_ -- First try a small / quick job and when you're sure it works, scale up + - Otherwise, use a general template +- Try first running interactively (**not** in login node) + - Perhaps easier to find the correct command line options + - Use `top` command to get rough estimate on memory use etc. 
+  - If developers provide some test or example data, run it first and make sure the results are correct
+
+# Running a new application in Puhti 2/2
+
+- You can use the *test* queue to check the correctness of your batch job script
+  - Limits: 15 min, 2 nodes
+  - Job turnaround is usually very fast even if the machine is "full"
+  - Can be useful to spot typos, missing files etc. before submitting a job that will spend a long time in the queue
+- Before large runs, it’s a good idea to do a smaller trial run
+  - Check that results are as expected
+  - Check resource usage after the test run and adjust accordingly
+- How many cores to allocate?
+  - This depends on many things, so you must try; see our [instructions on a scaling test](https://docs.csc.fi/support/tutorials/cmdline-handson/#scaling-test-for-an-mpi-parallel-job)
+
+
+# What if your job fails? Troubleshooting checklist 1/2
+
+ 1. Did the job run out of time?
+ 2. Did the job run out of memory?
+ 3. Did the job actually use the resources you specified?
+    * Problems in the batch job script can cause parameters to be ignored and default values to be used instead
+ 4. Did it fail immediately or did it run for some time?
+    * Jobs failing immediately are often due to something simple like typos on the command line, missing inputs, bad parameters etc.
+
+# What if your job fails? Troubleshooting checklist 2/2
+
+ 5. Check the error file captured by the batch job script
+ 6. Check any other error files and logs the program may have produced
+ 7. Error messages can sometimes be long, cryptic and a bit intimidating, but ...
+    * Try skimming through them and see if you can spot something ”human readable” instead of ”nerd readable”
+    * Often you can spot the actual problem easily if you go through the whole message. Something like ”required input file so-and-so missing” or ”parameter X out of range” etc.
+ 8. Consult the [FAQ on common Slurm issues](https://docs.csc.fi/support/faq/why-does-my-batch-job-fail/) in docs.csc.fi
 
 # Document your discoveries
 
 - When you've successfully solved an issue, make it easy to rediscover it
-- Set up a file in your `$HOME` and add your commands there with keywords for yourself
-  - e.g. it's quick to copy/paste your command from the screen to the end of the file
+- Set up a file in your `$HOME` and add your commands there
+  - It's quick to copy/paste from the screen to the end of the file
 
 ```bash
 cat >> $HOME/vault
-Ctlr-C
+Ctrl-C
 ```
-- and `grep`'ing it later is quick
+- ... and finding _them_ with `grep` later is quick (`grep them $HOME/vault`)
+  - `bash` history is nice, but it also keeps the ones that didn't work...
+  - Note: don't overwrite your vault file (_e.g._ with `cat > $HOME/vault`)
 - Store scripts in `$HOME/bin` and take backups
-
-# Keep notes
-
-Save useful commands and short scripts to a file for later reference, and push
-your notes to GitHub every now and then. Bash history keeps all the commands
-that you typed, but it also keeps the ones that did not work...
diff --git a/slides/01_environment.md b/slides/01_environment.md
new file mode 100644
index 0000000..f0f31b1
--- /dev/null
+++ b/slides/01_environment.md
@@ -0,0 +1,55 @@
+---
+theme: csc-2019
+lang: en
+---
+
+# Brief introduction to HPC environments {.title}
+
+# Some notes on vocabulary
+- computer ~= node +- processor ~= socket +- core~= CPU +
+
+ +
+ +# Cluster systems +
+- Login nodes are used to set up the jobs +- Jobs are run in the compute nodes +- A batch job system (aka scheduler) is used to run and manage the jobs + - On CSC machines we use Slurm + - Other common systems include SGE and Torque/PBS + - Syntax is different, but basic operation is similar +
+
+ +
+
+# Planning jobs
+- What kind of resources can your application use?
+  - Can it use more than one core?
+  - How much memory will it need?
+  - Can it use GPU?
+- See what kind of resources are available
+  - Each system is different, so check the documentation
+
+# Things to check
+- What kind of nodes are available?
+  - Number of cores
+  - Size of memory
+  - Extra hardware, *e.g.* GPU, fast local storage
+- What partitions (queues) are available
+  - Job sizes, max run time, etc.
+  - Provisioning policy
+  - Per core/per node/other
+
+# Available HPC resources
+
+Check CSC Docs pages for information on available resources
+
+ - [Puhti technical details](https://docs.csc.fi/computing/systems-puhti/)
+ - [Mahti technical details](https://docs.csc.fi/computing/systems-mahti/)
+ - [Available partitions](https://docs.csc.fi/computing/running/batch-job-partitions/)
\ No newline at end of file
diff --git a/slides/02_logging_in.md b/slides/02_logging_in.md
index 9ddafb9..0579449 100644
--- a/slides/02_logging_in.md
+++ b/slides/02_logging_in.md
@@ -11,16 +11,29 @@ In this section, you will learn how to login on CSC supercomputers with ssh and
 
 - SSH is a terminal program that will give you access to the command line on the CSC supercomputer
 - It is the versatile main interface to a supercomputer
-- Please read this page for an introduction on how to login with ssh: https://docs.csc.fi/computing/connecting/
+- Please read this page for an introduction on [how to log in with ssh](https://docs.csc.fi/computing/connecting/)
+  - Mac and Linux come with ssh; Windows PowerShell can be used, but we recommend applications like MobaXterm, PuTTY or Cmder
 - Note the [prerequisites to be able to access Puhti](https://docs.csc.fi/support/faq/how-to-get-puhti-access/)
+- Plain ssh will not allow displaying remote graphics
+  - It can be enabled by tunneling, but on Windows it requires additional installations; see the link above.
 
 # Log in via NoMachine
 
-- NoMachine is a software that makes remote graphics easier, like running a graphical version of the R software.
-- NoMachine client must be installed locally on your computer first (May require admin privileges).
+- NoMachine is software that makes remote graphics easier, like using a graphical user interface (GUI)
+  - Note that [R software](https://docs.csc.fi/apps/r-env-singularity/) and [Jupyter Notebooks](https://docs.csc.fi/computing/running/interactive-usage/#example-running-a-jupyter-notebook-server-via-sinteractive) offer a better way via a client-server approach
+- The NoMachine client must be installed locally on your computer first (this may require admin privileges).
 - [The client is used to connect to a server in Kajaani](https://docs.csc.fi/apps/nomachine/) and it provides faster graphical performance than X11-forwarding
 - Please first consult the [Instructions on how to install and to use NoMachine](https://docs.csc.fi/support/tutorials/nomachine-usage/)
 
+# Moving files between a local computer and Puhti
+
+- [scp](https://docs.csc.fi/data/moving/scp/) and [rsync](https://docs.csc.fi/data/moving/rsync/) are powerful command line tools to copy files
+  - scp works even in Windows PowerShell (but rsync is missing)
+  - e.g. `scp filename cscusername@puhti.csc.fi:/scratch/project_xxxx`
+- Sometimes a [GUI tool for transferring files](https://docs.csc.fi/data/moving/graphical_transfer/) is more convenient
+  - Nice tools are e.g.
FileZilla and WinSCP
+  - Installing such tools may require admin privileges
+
# Advanced topic: Setting up SSH-keys

- Using SSH-keys is easier and safer than using a password with every login.
diff --git a/slides/07_allas.md b/slides/07_allas.md
index 17a99c1..fe25caa 100644
--- a/slides/07_allas.md
+++ b/slides/07_allas.md
@@ -1,12 +1,14 @@
---
-theme: Allas storage service
+theme: csc-2019
lang: en
---
+# Allas object storage service {.title}

# How to get access to Allas

Use [https://my.csc.fi](https://my.csc.fi) to
+
1. Register to CSC (Haka)
2. Set up a project at CSC (Principal Investigator)
3. Apply for Puhti and Allas service, quota and billing units for your project
@@ -17,20 +19,26 @@
All project members have equal access to the data in Puhti and Allas.

# Allas – object storage: what it is for?

-* Allas is new storage service for all computing and cloud services
+* Allas is a new storage service for all computing and cloud services
* CEPH-based object storage
* Meant for data during project lifetime
-*  Default quota 10 TB / Project.
+* Default quota 10 TB / project.
* Possible to upload data from personal laptops or organizational storage systems into Allas
* Clients available in Puhti and Mahti
* Data can also be shared via the Internet

# Allas – object storage: what it is for?
+ +
* Data can be moved to and from Allas directly without using Puhti or Mahti. -* For the computation the data has to be typically copied to a file system in some computer +* For computation the data has to be typically copied to a file system in some computer * Data can be shared publicly to Internet, which is otherwise not easily possible at CSC. - +
+
+!["Allas"](img/allas.png "Allas"){width=90%} +
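To illustrate the first bullet above, here is a small sketch of moving data straight between your own machine and Allas with rclone, without logging in to Puhti or Mahti. It is not part of the original slides and assumes you have installed rclone locally and configured a remote called `allas` for the CSC object storage; the remote and bucket names are placeholders.

```bash
# List the buckets of your project (assumes a configured remote named "allas")
rclone lsd allas:

# Copy a local directory into a bucket (the bucket is created if it does not exist)
rclone copy ./mydata allas:project-mydata

# Copy an object from Allas back to the local machine
rclone copy allas:project-mydata/file.txt ./
```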
# Allas – object storage: what it is NOT @@ -40,67 +48,67 @@ All project members have equal access to the data in Puhti and Allas. # Allas - storage -* An object is stored in multiple servers so a disk or server break does not cause data loss. +* An object is stored in multiple servers so a disk or server break does not cause data loss. * There is no backup i.e. if a file is deleted, it cannot be recovered -* Data cannot be modified while it is in the object storage – data is immutable. -* Rich set of data management features to be built on top of it. -* Usage thrpough S3 and Swift APIs supported +* Data cannot be modified while it is in the object storage – data is immutable. +* Rich set of data management features are to be built on top of it. +* Usage through S3 and Swift APIs are supported # Allas – object storage: terminology * Storage space in Allas is provided per **CSC project** * Project space can have multiple *buckets* ( up to 1000) -* Only one level of hierarchy of buckets (no buckets within buckets) +* There is only one level of hierarchy of buckets (no buckets within buckets) * Data is stored as **objects** within a bucket * Objects can contain can contain any type of data (generally, object = file) -* In Allas you can have 500 000 objetcs / bucket +* In Allas you can have 500 000 objects / bucket * Name of the bucket must be unique within Allas * Objects have metadata that can be enriched -* In reality, there is no hierarcical dirctory structure, even tough it sometimes looks like that. +* In reality, there is no hierarcical directory structure, although it sometimes looks like that. -# Allas supports Two Protocols +# Allas supports two protocols * S3 (used by: s3cmd) * Swift (used by: swift, rclone, a-tools, cyberduck) * Authentication is different -* S3: permanent key based authentication – nice, easy and unsecure + * S3: permanent key based authentication – nice, easy and unsecure * Swift: authentication based on temporary tokens – more secure, requires authentication every 8 hours -* Metadata is handled in different ways -* Over 5G files are managed in different ways + * Metadata is handled in different ways + * Over 5G files are managed in different ways * → **Avoid cross-using Swift and S3 based objects!** # Allas Clients: read, write, delete **Puhti, Mahti, Linux servers, Macs:** -* rclone, switft, s3cdm, a-tools +* rclone, swift, s3cdm, a-tools **Virtual machines, small servers:** -* in addition to the tools above you can use FUSE based virtual mounts +* In addition to the tools above, you can use FUSE based virtual mounts **Laptops (Windows, Mac):** * Cyberduck, FileZilla(pro), pouta-www interface - +FIXME: links to these / detailed instructions? # Allas – first steps for Puhti * Use [https://my.csc.fi](https://my.csc.fi) to apply Allas access for your project – Allas is not automatically available * In Puhti and Mahti, setup connection to Allas with commands: -```text +```bash module load allas allas-conf ``` -* Study the manual and Start using Allas with rclone or a-tools:[https://docs.csc.fi/data/Allas/](https://docs.csc.fi/data/Allas/) +* Study the manual and [Start using Allas with rclone or a-tools instructions](https://docs.csc.fi/data/Allas/) # Allas – rclone -* Straight-forward power-user tool with wide range of features +* Straight-forward power-user tool with wide range of features. * Fast and effective. * Available for Linux, Mac and windows. * Overwrites and removes data without asking! 
* The default configuration at CSC uses the Swift protocol, but S3 can be used too.
-Use with care: [https://docs.csc.fi/#data/Allas/using_allas/rclone/](https://docs.csc.fi/#data/Allas/using_allas/rclone/)
+Use with care: [rclone instructions in Docs CSC](https://docs.csc.fi/#data/Allas/using_allas/rclone/)

# Allas – a-tools

@@ -111,33 +119,33 @@ Use with care: [https://docs.csc.fi/#data/Allas/using_allas/rclone/](https://doc
* Unlike rclone, a-tools do not overwrite and remove data without asking!
* Automatic packing and compression.
* Default bucket names based on directories of Puhti
-* [https://docs.csc.fi/#data/Allas/using_allas/a_commands/](https://docs.csc.fi/#data/Allas/using_allas/a_commands/)
+* [a-commands instructions in Docs CSC](https://docs.csc.fi/#data/Allas/using_allas/a_commands/)

# A-put/a-get: pros and cons

-+ saving data as a tar package preserves time stamps, accession settings, and internal links of the directory.
-+ zstdmt compression reduces size
-+ default bucket name and metadata reflect the directory sturctures of Puhti and Mahti
-+ checks to prevent over writing data accidentally
++ Saving data as a tar package preserves time stamps, access settings, and internal links of the directory.
++ `zstdmt` compression reduces size
++ Default bucket name and metadata reflect the directory structures of Puhti and Mahti
++ Checks to prevent overwriting data accidentally

-- usage of objects, created by a-put can be complicated when other object storage tools are used
-- ecpecially windows is problematic
-- each object has additional _ameta object
+- Usage of objects created by `a-put` can be complicated when other object storage tools are used
+- Especially usage from Windows is problematic
+- Each object has an additional _ameta object

# Allas problems

* 8-hour connection limit with Swift
* No way to check quota
-* Movin data inside Allas is not possible (swift)
+* Moving data inside Allas is not possible (Swift)
-* No way to freeze data ( use two projects if needed).
+* No way to freeze data (use two projects if needed).
-* Diffrent interfaces may work in diffrent ways
+* Different interfaces may work in different ways

# Things that users should consider

* Should I store files as one object or as bigger chunks?
* Should I use compression?
-* Who can use the data: Projects and accession permissions ?
+* Who can use the data: Projects and access permissions?
* What will happen to my data later on?
* How to keep track of all the data I have in Allas?
diff --git a/slides/09_singularity.md b/slides/09_singularity.md
index bab6725..41bd473 100644
--- a/slides/09_singularity.md
+++ b/slides/09_singularity.md
@@ -13,6 +13,15 @@ and how to use them in CSC environment.
- Popular container engines include Docker, Singularity, Shifter
- Singularity is the most popular in HPC environments
+# Containers vs. Virtual Machines (1/2)
+
+# Containers vs. Virtual Machines (2/2)
+- Virtual machines can run a totally different OS than the host
+(e.g. Windows on a Linux host or vice versa)
+- Containers share the kernel with the host, but can have their own libraries etc.
+  - Can run e.g. a different Linux distribution than the host
+
# Container benefits: Ease of installation
- Containers are becoming a popular way to distribute software
  - Single command installation
@@ -30,8 +39,8 @@
# Container benefits: Environment reproducibility
- Analysis environment can be saved as a whole
-  - Usefull with e.g. Python, where updating underlaying
-    libraries (Numpy etc) can lead to differences
+  - Useful with e.g. Python, where updating underlying
+    libraries (NumPy etc.) can lead to differences in behavior
- Sharing with collaborators easy (single file)

# Singularity in a nutshell
@@ -46,19 +55,18 @@ and how to use them in CSC environment.
- Running Docker directly would require root rights

# Singularity on CSC servers
-- Singularity installed only in compute nodes
-- Singularity jobs need to run as batch jobs or with `sinteractive`
+- Singularity jobs should be run as batch jobs or with `sinteractive`
- No need to load a module
- Users can run their own containers
- Some CSC software installations provided as containers
  - See software pages for details

# Running Singularity containers: Basic syntax
+- Execute a command in the container
+  - `singularity exec [exec options...] <image> <command>`
- Run the default action (runscript) of the container
  - Defined when the container is built
  - `singularity run [run options...] <image>`
-- Execute a command in the container
-  - `singularity exec [exec options...] `
- Open a shell in the container
  - `singularity shell [shell options...] <image>`
@@ -80,7 +88,7 @@ setting in host `$SINGULARITYENV_variablename`.
- E.g. to set `$TEST` in the container, set `$SINGULARITYENV_TEST` in the host

# singularity_wrapper
-- Running containers with singularity_wrapper takes care of most common `--bind` commands
+- Running containers with `singularity_wrapper` takes care of the most common `--bind` options
- `singularity_wrapper exec image.sif myprog <options>`
- If the environment variable `$SING_IMAGE` is set with the path to the image, even the image file can be omitted
- `singularity_wrapper exec myprog <options>`
@@ -108,16 +116,23 @@ otherwise problematic:
- Singularity: 1 file, total size 339 MB
- Containers are not the solution for everything, but they do have their uses…

-# Building a new Singularity container
-- Typical steps
-  - Build a basic container in sandbox mode (`--sandbox`)
+# Building a new Singularity container (1/3)
+- Requires root access: cannot be done directly in e.g. Puhti
+
+- 1. Build a basic container in sandbox mode (`--sandbox`)
  - Uses a folder structure instead of an image file
  - Requires root access!
-  - Open a shell in the container and install software
-    - Depending on base image system, package managers can be used to install libraries and dependencies (apt install, yum install etc)
-    - Installation as per software developer instructions
-  - Build a production image from the sandbox
-  - (optional) Make a definition file and build a productio image from it
-    - Mostly necesary if you wish to distribute your container wider.
-
-Requires root access: Can not be done directly in e.g. Puhti
\ No newline at end of file
+
+# Building a new Singularity container (2/3)
+- 2.
Open a shell in the container and install software
+  - Depending on the base image system, package managers can be used to install
+    libraries and dependencies (`apt install`, `yum install`, etc.)
+  - Installation as per the software developer's instructions
+
+# Building a new Singularity container (3/3)
+- 3. Build a production image from the sandbox
+- (optional) Make a definition file and build a production image from it
+  - Mostly necessary if you wish to distribute your container more widely
+  - Also helps with updating and re-using containers
+- Production image can be transferred to e.g. Puhti and run with user rights (see the command sketch further below)
+
diff --git a/slides/img/allas-nextcloud.png b/slides/img/allas-nextcloud.png
new file mode 100644
index 0000000..a043af5
Binary files /dev/null and b/slides/img/allas-nextcloud.png differ
diff --git a/slides/img/allas-p-put.png b/slides/img/allas-p-put.png
new file mode 100644
index 0000000..e77150c
Binary files /dev/null and b/slides/img/allas-p-put.png differ
diff --git a/slides/img/allas-projects.png b/slides/img/allas-projects.png
new file mode 100644
index 0000000..c5f4f17
Binary files /dev/null and b/slides/img/allas-projects.png differ
diff --git a/slides/img/allas-rclone.png b/slides/img/allas-rclone.png
new file mode 100644
index 0000000..e543ee9
Binary files /dev/null and b/slides/img/allas-rclone.png differ
diff --git a/slides/img/allas.png b/slides/img/allas.png
new file mode 100644
index 0000000..3c08afb
Binary files /dev/null and b/slides/img/allas.png differ
diff --git a/slides/img/allas.svg b/slides/img/allas.svg
new file mode 100644
index 0000000..5858cf5
--- /dev/null
+++ b/slides/img/allas.svg
@@ -0,0 +1,4518 @@
[SVG source omitted: Allas overview figure (image/svg+xml), no readable text content]
diff --git a/slides/img/array-and-greasy_1.png b/slides/img/array-and-greasy_1.png
new file mode 100644
index 0000000..ecf976e
Binary files /dev/null and b/slides/img/array-and-greasy_1.png differ
diff --git a/slides/img/array-and-greasy_2.png b/slides/img/array-and-greasy_2.png
new file mode 100644
index 0000000..100db3f
Binary files /dev/null and b/slides/img/array-and-greasy_2.png differ
diff --git a/slides/img/array-and-greasy_3.png b/slides/img/array-and-greasy_3.png
new file mode 100644
index 0000000..8888fca
Binary files /dev/null and b/slides/img/array-and-greasy_3.png differ
diff --git a/slides/img/array-and-greasy_4.png b/slides/img/array-and-greasy_4.png
new file mode 100644
index 0000000..dc9331a
Binary files /dev/null and b/slides/img/array-and-greasy_4.png differ
diff --git a/slides/img/array-and-greasy_5.png b/slides/img/array-and-greasy_5.png
new file mode 100644
index 0000000..4d55256
Binary files /dev/null and b/slides/img/array-and-greasy_5.png differ
diff --git a/slides/img/array-and-greasy_6.png b/slides/img/array-and-greasy_6.png
new file mode 100644
index 0000000..9a625e0
Binary files /dev/null and b/slides/img/array-and-greasy_6.png differ
diff --git a/slides/img/array-and-greasy_7.png b/slides/img/array-and-greasy_7.png
new file mode 100644
index 0000000..3d82544
Binary files /dev/null and b/slides/img/array-and-greasy_7.png differ
diff --git a/slides/img/cluster.svg b/slides/img/cluster.svg
new file mode 100644
index 0000000..6b44a32
--- /dev/null
+++ b/slides/img/cluster.svg
@@ -0,0 +1,2626 @@
[SVG source omitted: cluster diagram with login nodes, compute nodes (sockets, CPUs, memory, local storage) and Lustre storage]
diff --git a/slides/img/containers-fig1.png b/slides/img/containers-fig1.png
new file mode 100644
index 0000000..5f9f9d8
Binary files /dev/null and b/slides/img/containers-fig1.png differ
diff --git a/slides/img/node.svg b/slides/img/node.svg
new file mode 100644
index 0000000..d13dabd
--- /dev/null
+++ b/slides/img/node.svg
@@ -0,0 +1,775 @@
[SVG source omitted: single-node diagram with two sockets of CPUs, memory and local storage (not on all nodes)]
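To make the "Building a new Singularity container" steps above concrete, here is a minimal sketch of the sandbox workflow. It is not part of the original slides: the base image, container and program names are placeholders, and root access is assumed on the machine where the image is built.

```bash
# 1. Build a basic container in sandbox mode (a writable folder); requires root
sudo singularity build --sandbox mycontainer/ docker://ubuntu:20.04

# 2. Open a writable shell in the sandbox and install the software you need
sudo singularity shell --writable mycontainer/
#    (inside the container, e.g.: apt update && apt install -y <your software>)

# 3. Build a production image (.sif) from the sandbox
sudo singularity build mycontainer.sif mycontainer/

# The .sif file can then be copied to e.g. Puhti and run with normal user rights
singularity exec mycontainer.sif myprog --help
```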