# How to work with JupyterLab everywhere

This guide **will** walk you through the complete setup of JupyterLab and one or more scientific Python computing environments on your local **Linux** or **MacOSX** computer or on a remote **Linux/Unix** machine.  This guide **will not** explain how to use the shell, how to connect to remote machines via SSH, or how to make sure you have access to the relevant networks, e.g., via VPN.

## Quick start

Clone this repository (only necessary once):
```bash
git clone https://git.geomar.de/python/jupyter_on_HPC_setup_guide.git
```
Then run
```bash
jupyter_on_HPC_setup_guide/scripts/remote_jupyter_manager.sh
```
and follow the instructions given. The script installs a (hopefully) complete Python/JupyterLab environment on any machine and starts JupyterLab so you can start working immediately.

---

Please _**continue reading**_ to learn how to specify which software to use and to get a more detailed understanding about your computing environment.

## Install the base environment and JupyterLab

This section shows how to download the installer, install a minimal Python environment, and start JupyterLab on any **Linux/Unix** or **MacOSX** system.

Execute the steps below on the computer where the calculation should be performed (e.g., your local computer or a computer in a remote computing centre).


### Download, install and initialize Miniconda3 (only necessary once)

_As Python 2 won't be supported beyond the end of 2019, please make sure to **always use Python 3** in all following steps._

Download the latest Miniconda3 on **Linux/Unix**:
```bash
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -o Miniconda3.sh
```

Or download the latest Miniconda3 on **MacOSX**:
```bash
curl https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh -o Miniconda3.sh
```
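
The two download commands differ only in the platform part of the file name. A minimal sketch that picks the installer URL from the output of `uname -s` could look as follows. The function name `miniconda_installer_url` is made up for this example, and the sketch assumes an x86_64 machine, as do the URLs above.

```bash
# Hypothetical helper (not part of this repository): print the Miniconda3
# installer URL for a given kernel name as reported by `uname -s`.
# Assumes an x86_64 machine, like the download URLs above.
miniconda_installer_url() {
    case "$1" in
        Linux)  platform="Linux" ;;
        Darwin) platform="MacOSX" ;;
        *)      echo "Unsupported system: $1" >&2; return 1 ;;
    esac
    echo "https://repo.anaconda.com/miniconda/Miniconda3-latest-${platform}-x86_64.sh"
}

# Show the URL that matches the current machine.
miniconda_installer_url "$(uname -s)"
```

To download the installer with it, pass the result to `curl`, e.g. `curl "$(miniconda_installer_url "$(uname -s)")" -o Miniconda3.sh`.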

Then, to install Miniconda3 to your home directory, initialize conda, and make sure that you don't override any Python installation from your operating system, run:
```bash
bash Miniconda3.sh -b -p ${HOME}/miniconda3
${HOME}/miniconda3/bin/conda init bash
${HOME}/miniconda3/bin/conda config --set auto_activate_base false
```

After installation and initialization, log out and log back in again.
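
After logging back in, it is worth checking that the shell actually finds `conda` before going on. The following is a minimal sketch; the helper name `have_command` is an assumption of this example and relies only on the POSIX `command -v` builtin.

```bash
# Hypothetical helper: succeed if the given command can be found by the shell.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command conda; then
    # Print the installed version as a basic sanity check.
    conda --version
else
    echo "conda not found; did you log out and back in after 'conda init bash'?"
fi
```

Because `auto_activate_base` was set to `false` above, the `base` environment is not active in a fresh shell until you run `conda activate base`.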

### Install JupyterLab to the base environment

Before continuing, make sure you are in a bash shell by typing:
```bash
echo $SHELL | grep bash || bash
```

Then (and in the same terminal), install `jupyterlab` and `nb_conda_kernels` to the `base` environment by running:
```bash
conda install -n base jupyterlab nb_conda_kernels
```

### Add a scientific Python environment

At this point, there is only a minimal Python environment called `base` that contains `conda` and JupyterLab.  The following explains how to add a Python environment that can be used for scientific analyses.

Use `conda` to create an environment (called `py3_std` in this example, and containing Python 3, `numpy`, `matplotlib`, `scipy`, and `ipykernel`):
```bash
conda create -n py3_std python=3 numpy matplotlib scipy ipykernel
```

Make sure to _**always** install `ipykernel` into the environment_, so that JupyterLab recognizes the new environment as a kernel.

An opinionated recommendation for a more complete computing environment [can be found in the appendix](#appendix-recommended-environment).
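
To make the `ipykernel` rule from above hard to forget, environment creation can be wrapped in a small helper. The following is only a sketch: `with_ipykernel` is a hypothetical function, not part of `conda`, that appends `ipykernel` to a package list unless it is already present.

```bash
# Hypothetical helper: print the given package names and append `ipykernel`
# if it is not already among them.
with_ipykernel() {
    found=no
    for pkg in "$@"; do
        if [ "$pkg" = "ipykernel" ]; then
            found=yes
        fi
    done
    if [ "$found" = "no" ]; then
        set -- "$@" ipykernel
    fi
    echo "$*"
}

# Print the package list for the example environment from above.
with_ipykernel python=3 numpy matplotlib scipy
```

The result could then be used as `conda create -n py3_std $(with_ipykernel python=3 numpy matplotlib scipy)`.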


### Use the environment in JupyterLab

Above, we created a new environment `py3_std`.  To use it, activate the `base`
environment and start JupyterLab:
```bash
conda activate base
jupyter lab --no-browser --ip 127.0.0.1
```

If you installed Python on the computer you're sitting in front of, copy the URL given at the end of the output of the above commands into your browser (the `--no-browser` flag prevents JupyterLab from opening a browser automatically):
```
Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:
        http://127.0.0.1:8888/?token=XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
```
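
If JupyterLab's output has scrolled away or was written to a log file, the URL can be recovered with standard text tools. The sketch below assumes the output format shown above; `extract_jupyter_url` is a hypothetical helper name, and the token in the sample input is a placeholder.

```bash
# Hypothetical helper: read JupyterLab's start-up output on stdin and print
# the first URL that carries a login token.
extract_jupyter_url() {
    sed -n 's|^.*\(http://[^ ]*token=[^ ]*\).*$|\1|p' | head -n 1
}

# Example with a placeholder token instead of real JupyterLab output.
printf '%s\n' \
    'Copy/paste this URL into your browser when you connect for the first time,' \
    '    to login with a token:' \
    '        http://127.0.0.1:8888/?token=XXXX' |
    extract_jupyter_url
```

With real output, one could start JupyterLab with `jupyter lab --no-browser --ip 127.0.0.1 2>&1 | tee jupyterlab.log` and later run `extract_jupyter_url < jupyterlab.log`; the log file name is arbitrary.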

In JupyterLab, you should be able to create a new notebook and choose the `py3_std` environment as its kernel.

## Connect to JupyterLab running on a remote machine

At this point, you know how to set up a full Python installation including a JupyterLab front end in the `base` environment and one or more scientific computing environments that can be used in real scientific analyses. The instructions above work on any local or remote Linux/Unix or MacOSX machine.  Below, you'll learn how to connect to JupyterLab if you installed Python on a remote machine that you can only connect to via SSH.


### Start JupyterLab on a remote machine

After installing Jupyter and your Python environments on a remote machine, follow the instructions on how to [Use the environment in JupyterLab](#use-the-environment-in-jupyterlab) above. But before pasting the URL provided by Jupyter into your browser, make sure to follow the next steps.

### The essential steps

To let your browser see the network as it looks from a remote machine
(called `host.example.com` here), run the following:
```bash
ssh -f -D localhost:54321 user@host.example.com sleep 15
chromium-browser --proxy-server="socks5://localhost:54321"
```

This opens an SSH session that provides a local network socket listening on port `54321`, which tunnels all traffic through `host.example.com`, and then tells Chromium to use this socket as a proxy server.  _(Note that instead of `54321`, we could have used any free non-privileged port, i.e., any number between 1024 and 65535.)_
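
Since any free non-privileged port works, the port number is a natural thing to parameterize. The following sketch validates a port and prints the two commands from above for it. `is_unprivileged_port` and `print_tunnel_commands` are hypothetical helpers, and the sketch does not check whether the port is actually free.

```bash
# Hypothetical helper: succeed if $1 is an integer between 1024 and 65535.
is_unprivileged_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$1" -ge 1024 ] && [ "$1" -le 65535 ]
}

# Hypothetical helper: print (not run) the tunnel and browser commands
# for a given login ($1, e.g. user@host.example.com) and local port ($2).
print_tunnel_commands() {
    is_unprivileged_port "$2" || { echo "Invalid port: $2" >&2; return 1; }
    echo "ssh -f -D localhost:$2 $1 sleep 15"
    echo "chromium-browser --proxy-server=\"socks5://localhost:$2\""
}

print_tunnel_commands user@host.example.com 54321
```

Printing the commands instead of running them keeps the sketch free of side effects; they can be copied into the terminal once they look right.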

### Wrapped in a script

The following provides a script that bundles this and handles a few caveats like isolating the tunneled browser session from your other activities on the internet.

Download the script (only needed once):
```bash
curl https://git.geomar.de/python/jupyter_on_HPC_setup_guide/raw/master/scripts/run_chromium_through_ssh_tunnel.sh -O
chmod a+x run_chromium_through_ssh_tunnel.sh
```

And run it to connect to `host.example.com`:
```bash
./run_chromium_through_ssh_tunnel.sh user@host.example.com <URL-from-JupyterLab>
```

Note that these steps will only work if `host.example.com` (or whatever the real address of the remote computer is) is a known host in `.ssh/known_hosts`. (To test this, simply create an SSH connection to `host.example.com`. If it is not yet a known host, you will be asked whether to continue, and if you do, `host.example.com` will be added to `.ssh/known_hosts`.)
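
Whether a host is already known can also be checked without opening a connection. The sketch below uses `ssh-keygen -F`, which searches a `known_hosts` file for a host name; `is_known_host` is a hypothetical wrapper, and the default file location `~/.ssh/known_hosts` is assumed.

```bash
# Hypothetical wrapper: succeed if host $1 has an entry in the known_hosts
# file $2 (default: ~/.ssh/known_hosts). Relies on `ssh-keygen -F`, which
# prints the matching entries, if there are any.
is_known_host() {
    file="${2:-$HOME/.ssh/known_hosts}"
    [ -n "$(ssh-keygen -F "$1" -f "$file" 2>/dev/null)" ]
}

if is_known_host host.example.com; then
    echo "host.example.com is a known host"
else
    echo "host.example.com is not yet known; connect once with ssh to add it"
fi
```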


### Wrapped in a script on Windows

First, make sure to have `Git bash` installed.  You can obtain it by installing <https://gitforwindows.org/>.  (Note that you don't really need Git for all the following, but we'll use it to get a fully functional Bash on Windows.)

Then, follow the steps above but replace `run_chromium_through_ssh_tunnel.sh` by `run_chromium_through_ssh_tunnel_WINDOWS.sh`.


## Start JupyterLab on a compute node of an HPC centre

In [job-scripts/](job-scripts/), there are example scripts showing how to start a job that runs JupyterLab on a compute node.

### On Nesh

With [`nesh-linux-cluster-jupyterlab.sh`](job-scripts/nesh-linux-cluster-jupyterlab.sh), you can submit a job as follows:
```shell
qsub nesh-linux-cluster-jupyterlab.sh \
    -l elapstim_req=<hh:mm:ss> \
    -b <node-no> \
    -l cpunum_job=<cpu-no> \
    -l memsz_job=<mem-size> \
    -q <batch-class>
```

Checking the status of the job and retrieving the URL on which to connect to JupyterLab is done with:
```shell
bash nesh-linux-cluster-jupyterlab.sh <jobid>
```

### At HLRN Berlin

With [`hlrn-goettingen-jupyterlab.sh`](job-scripts/hlrn-goettingen-jupyterlab.sh), you can submit a job with
```shell
sbatch -t <hh:mm:ss> hlrn-goettingen-jupyterlab.sh
```
where `<hh:mm:ss>` specifies the desired walltime.
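
The walltime has to be given as `hh:mm:ss`. If you think in minutes, a small conversion helper avoids typos. This is a sketch only; `walltime_from_minutes` is a hypothetical function name and not part of the job scripts.

```bash
# Hypothetical helper: convert a number of minutes to the hh:mm:ss format.
walltime_from_minutes() {
    printf '%02d:%02d:00\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}

# Print the walltime string for a 90 minute job.
walltime_from_minutes 90
```

It could be combined with the submit command as `sbatch -t "$(walltime_from_minutes 90)" hlrn-goettingen-jupyterlab.sh`.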

Checking the status of the job and retrieving the URL on which to connect to JupyterLab is done with:
```shell
bash hlrn-goettingen-jupyterlab.sh <jobid>
```

You can also just run
```shell
bash hlrn-goettingen-jupyterlab.sh
```
to start JupyterLab where you are (e.g. on a login node).


## Appendix: Troubleshooting

- If you get `ImportError: [...]: version GLIBC_[...] not found` errors on relatively old machines, make sure to prioritize the `defaults` channel over `conda-forge`.  This is done by specifying `-c defaults` **before** `-c conda-forge` in `conda install` or `conda create` commands.

- If your installation fails, check <https://conda.io/docs/user-guide/install/#regular-installation>. Make sure to follow the instructions for Miniconda**3** and do _**not**_ make this installation your standard Python.
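
For the first point, the order of the `-c` options is what matters. As a sketch, a command that gives `defaults` priority over `conda-forge` could be assembled like this; `channel_args` is a hypothetical helper that only builds the option string and does not call `conda` itself.

```bash
# Hypothetical helper: turn channel names into `-c` options, keeping the
# order in which they are given (channels listed first get priority).
channel_args() {
    args=""
    for channel in "$@"; do
        args="${args:+$args }-c $channel"
    done
    echo "$args"
}

# Print the command instead of running it, so it can be inspected first.
echo "conda create -n py3_std $(channel_args defaults conda-forge) python=3 numpy ipykernel"
```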


## Appendix: Jargon

- `conda` is the package manager of the [Anaconda Python
  distribution](https://en.wikipedia.org/wiki/Anaconda_(Python_distribution))

- [conda environments](https://conda.io/docs/user-guide/tasks/manage-environments.html)
  are collections of packages / versions that make it possible to switch
  between different sets of software.

- [JupyterLab](https://blog.jupyter.org/jupyterlab-is-ready-for-users-5a6f039b8906) is
  an integrated web-based computing environment


## Appendix: Recommended environment

```bash
conda create -n py3_std -c conda-forge python=3 \
    aospy basemap basemap-data-hires cartopy cdo cf_units \
    cftime cmocean dask distributed eofs fortran-magic \
    git git-lfs gsw haversine hdf5 ipython iris iris-sample-data \
    jupyter line_profiler matplotlib nco netCDF4 numpy pandas \
    pytables python-cdo scikit-learn scipy seaborn seawater \
    xarray zarr ipykernel
```


## Appendix: Installation process if you're behind a firewall

### What do you need to be able to access?

To download Miniconda and to install packages, you will need access to at least the following hosts:

- `repo.anaconda.com`
- `conda.anaconda.org`
- `files.pythonhosted.org`
- `repo.continuum.io`

If you want to install from Git repositories, you'd also need access to

- `github.com`
- `gitlab.com`
- `git.geomar.de`
- or wherever the repository from which you want to install lives.

### How to check if I'm blocked

You can use netcat to see if you can connect to a given port on a given host.  For example, to tell if you can connect to `repo.anaconda.com` via HTTPS (port `443`), run
```shell
nc -v -z repo.anaconda.com 443
```
which should return something like
```
Connection to repo.anaconda.com 443 port [tcp/https] succeeded!
```
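
To test all hosts from the list above in one go, the same `nc` call can be wrapped in a loop. The following is a sketch: `check_hosts` and the `PROBE` variable are inventions of this example (overriding `PROBE` is only meant to make the loop testable without network access), and the 5 second timeout set with `-w` is an arbitrary choice.

```bash
# Hypothetical helper: report for each given host whether port 443 can be
# reached. The probe command defaults to netcat with a 5 second timeout.
check_hosts() {
    for host in "$@"; do
        if ${PROBE:-nc -z -w 5} "$host" 443 >/dev/null 2>&1; then
            echo "ok $host"
        else
            echo "BLOCKED $host"
        fi
    done
}
```

A call for the hosts listed above would be `check_hosts repo.anaconda.com conda.anaconda.org files.pythonhosted.org repo.continuum.io`. Note that a missing `nc` binary is also reported as `BLOCKED`.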

### How to get access?

- Talk to the administrators of the system you want to install Python on.  (Consider showing them this section of the README?)
- Check if you can list the required target URLs in some form (as is the case at HLRN).