# python issues

Issue feed for <https://git.geomar.de/groups/python/-/issues> (retrieved 2022-11-22).

## [chrome doesn't find jupyter notebook http address on the HPC](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/49) (Lavinia Patara, 2022-11-22)

I am trying to run a Jupyter notebook on HLRN with local tunneling on a Mac. The notebook runs without problems on HLRN, and the tunneling script opens Chrome. The problem is that when I enter the http address corresponding to the Jupyter notebook, an error occurs ("Address not found"). It also occurred to me that when I launch the Jupyter script on a compute node (using sbatch), I don't know how to check on which partition it is running. Would anyone know what the problem is? Thanks.

## [Make sure to mention that conda commands do not work if .bashrc is not sourced](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/46) (Katharina Höflich, 2021-02-19)

If `.bashrc` is not sourced, e.g. because there is no corresponding entry in `.bash_profile` (or because that file is missing entirely), then `conda` commands do not work as expected, even though `conda` was initialized during installation.
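A minimal fix is to make login shells read `.bashrc` explicitly, for example with an entry like the following in `~/.bash_profile` (a sketch, assuming a standard bash setup; not taken from the guide):

```shell
# ~/.bash_profile -- make login shells also read ~/.bashrc, so that the
# conda initialization block added by the installer is picked up.
if [ -f "$HOME/.bashrc" ]; then
    . "$HOME/.bashrc"
fi
```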
Let's explicitly point this out in the troubleshooting section.

## [HLRN: How to use all cpus on the single-tenant queues?](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/45) (Willi Rath, 2021-02-16)

```
salloc --ntasks=1 -p standard96 -A $USER srun --pty bash -i
```
resulted in Dask only seeing 2 CPUs, but all ~350 GB of memory, available on the `standard96` nodes. Do we really need to specify `--cpus-per-task=96` to use them all, or is this a configuration issue at HLRN?

## [No connection to nesh with Mac OS X Catalina](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/44) (Patricia Handmann, 2021-01-28)

Even though I am running `./run_chromium_through_ssh_tunnel.sh myname6@nesh-fe1.rz.uni-kiel.de`,
nesh apparently refuses to connect:

> This site can't be reached
> 127.0.0.1 refused to connect.
> Try:
> Checking the connection
> Checking the proxy and the firewall
> ERR_CONNECTION_REFUSED
I hope you can help me.
Cheers,
Patricia

## [Idle/Orphaned Jupyter processes that block memory resources](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/43) (Katharina Höflich, 2020-09-17)

This happens on the Nesh login nodes (as reported in March by the Nesh admins, for example) and has started to appear also on the `scalc*` machines (the idle Jupyter processes that were especially present on `scalc01` were finally shut down with the reboot during yesterday's patch day, though).
I did a bit of reading/experimenting and think that a robust automatic (forced) shutdown of (forgotten) Jupyter processes could be achieved with a combination of the Linux `timeout` command and the built-in "culling features" of Jupyter.
In a scripted approach, it might look like this:
```
# in seconds
JUPYTER_HARD_TIMEOUT=300
JUPYTER_SOFT_TIMEOUT=60
JUPYTER_CULL_INTERVAL=10
JUPYTER_CULL_TIMEOUT=30
timeout $JUPYTER_HARD_TIMEOUT \
    jupyter lab \
        --LabApp.shutdown_no_activity_timeout=$JUPYTER_SOFT_TIMEOUT \
        --MappingKernelManager.cull_connected=True \
        --MappingKernelManager.cull_interval=$JUPYTER_CULL_INTERVAL \
        --MappingKernelManager.cull_idle_timeout=$JUPYTER_CULL_TIMEOUT \
        --TerminalManager.cull_interval=$JUPYTER_CULL_INTERVAL \
        --TerminalManager.cull_inactive_timeout=$JUPYTER_CULL_TIMEOUT \
        --ip=$(hostname) --no-browser
```
which implements both a soft and a hard time limit. The built-in shutdown mechanism for idle Jupyter kernels and terminals seems very robust and works as I expect (from reading the docs). The hard limit, on the other hand, I found necessary because I experienced unpredictable behaviour (at least for me) with the `shutdown_no_activity_timeout` option: sometimes an inactive JupyterLab open in the browser was shut down, and sometimes it was still there after several minutes (without any kernels or terminals running, or manual activity during this period). It seems to always work if the browser tab is closed, though.
I think it would be helpful to implement this (or a similar mechanism?) in the scripts we maintain here, i.e. in `nesh-linux-cluster-jupyterlab.sh`, `hlrn-goettingen-jupyterlab.sh`, and the `remote_jupyter_manager.sh` script. It would also be good to mention the problem of idling Jupyter processes that block memory resources in the README.md.

As I have never worked with a JupyterLab session across several work days, I personally would be happy (i.e. not disrupted) with default time limits such as:
```
# in seconds
JUPYTER_HARD_TIMEOUT=64800 # 18 hours
JUPYTER_SOFT_TIMEOUT=43200 # 12 hours
JUPYTER_CULL_INTERVAL=300 # 5 minutes (the Jupyter default, could/should be increased?)
JUPYTER_CULL_TIMEOUT=21600 # 6 hours
```
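When wrapping this into the job scripts, it may also help to distinguish a normal shutdown from one forced by the hard limit: `timeout` exits with status 124 when it had to kill the command. A sketch (the log messages are an assumption, not part of the scripts here):

```shell
# Run JupyterLab under the hard limit and report how it ended.
JUPYTER_HARD_TIMEOUT=64800  # 18 hours

timeout "$JUPYTER_HARD_TIMEOUT" jupyter lab --no-browser
status=$?
if [ "$status" -eq 124 ]; then
    # `timeout` uses exit status 124 to signal that it killed the command
    echo "JupyterLab was stopped by the hard time limit" >&2
else
    echo "JupyterLab exited on its own with status $status" >&2
fi
```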
Any thoughts on this? Should we and how should we implement this? What are (more) useful default time limits?
/cc @willi-rath @sebastian-wahl @martin-claus @klaus-getzlaff

## [[DOC] Make sure users don't run into round-robin intermittent problems](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/42) (Willi Rath, 2020-09-08)

I've seen people use round-robin host names (like `nesh-fe....`, or `glogin.hlrn.de`, etc.) and fail intermittently when trying to connect to services listening on `localhost`, because they end up with the tunnel going to a different host than the service.
We should add this to the documentation / troubleshooting section.

## [xorca_box.getbox - incomprehensible error message: 'y_c' and 'x_c' must have same length of dimension, ndim=1 - but both are 1-d](https://git.geomar.de/python/xorca_box/-/issues/20) (Kristin Burmeister, 2020-09-17)
Hi,

I get an error message when using xorca_box.getbox regarding the dimensions of my coordinates. Both x_c and y_c have only one dimension, but xorca_box.getbox tells me otherwise (or at least that is how I understand it):
`ValueError: dimensions ('y_c', 'x_c') must have the same length as the number of data dimensions, ndim=1`
See script below.
I am not sure where to start to find the actual error. Any help is appreciated. Thank you!
[Extract_regions_from_Viking20X.html](/uploads/7bc993139b9f7864dd169762289d853f/Extract_regions_from_Viking20X.html)

## [Linux connection problems](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/41) (Willi Rath, 2020-07-20)

From #39:
> While trying to set up jupyter on scalc01, I stumbled across a similar issue.
>
> I am using a linux machine `OD-NB010LX`. Starting jupyter-lab from the base env on scalc01 works but when I try to set up the tunnel with chromium, I also get
> > This site can’t be reached
> > 127.0.0.1 refused to connect.
> > Try:
> > Checking the connection
> > Checking the proxy and the firewall
> > ERR_CONNECTION_REFUSED
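For reference, a plain local port forward equivalent to what the tunnel setup should achieve might look like this (a sketch; the user name is a placeholder, and it assumes JupyterLab listens on port 8888 on `scalc01`):

```shell
# Local port for the tunnel; 8888 is Jupyter's default port.
LOCAL_PORT=${LOCAL_PORT:-8888}

# -N: no remote command, just keep the tunnel open.
# -L: forward the local port to JupyterLab's port on scalc01.
ssh -N -L "${LOCAL_PORT}:localhost:8888" user@scalc01
```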
@gabriel-ditzinger: Can you give details on how exactly you start the tunnel?

## [Document how to deal with round-robin host names](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/40) (Willi Rath, 2023-05-24)
In #39 (and a few times before), we see potential problems with JupyterLab running on a host that has been selected from a round-robin IP like:
```shell
$ host nesh-fe.rz.uni-kiel.de
nesh-fe.rz.uni-kiel.de has address 134.245.3.14
nesh-fe.rz.uni-kiel.de has address 134.245.3.13
nesh-fe.rz.uni-kiel.de has address 134.245.3.15
$ host nesh-fe.rz.uni-kiel.de | awk '{print $4}' | xargs -n1 host
13.3.245.134.in-addr.arpa domain name pointer nesh-fe1.rz.uni-kiel.de.
14.3.245.134.in-addr.arpa domain name pointer nesh-fe2.rz.uni-kiel.de.
15.3.245.134.in-addr.arpa domain name pointer nesh-fe3.rz.uni-kiel.de.
```
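One way to sidestep the mismatch is to pin one concrete front end (names as returned by the reverse lookup above) and use it consistently for both the JupyterLab session and the SSH tunnel; a sketch:

```shell
# Pin one concrete front end instead of the round-robin name ...
FRONTEND=nesh-fe1.rz.uni-kiel.de

# ... and use the same host for both steps:
ssh "$FRONTEND"                            # start jupyter lab here
ssh -N -L 8888:localhost:8888 "$FRONTEND"  # tunnel to the *same* host
```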
As this can lead to difficult-to-debug intermittent problems, we should document this and give a recommendation for how to deal with it.

## [SSH tunnel from Windows](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/39) (Annika Reintges, 2022-07-21)

My SSH tunnel from Windows is not working. To connect, I have now tried MobaXterm, Anaconda, and Windows PowerShell.
I tried the following things:
1) Without the script on MobaXterm:
`ssh -f -D localhost:54321 smomw247@nesh-fe.rz.uni-kiel.de sleep 15 `
...some waiting ... back to prompt ...
`chrome-browser --proxy-server="socks5://localhost:54321"`
no matter whether using "chromium" or "chrome" --> "command not found"
2) With the [script](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/tree/master#wrapped-in-a-script-on-windows) for Windows on MobaXterm (Git bash is installed)
`./run_chromium_through_ssh_tunnel_WINDOWS.sh smomw247@nesh-fe.rz.uni-kiel.de http://127.0.0.1:8889/?token=29ed9b500bfb49264316558faa963416586781409b70ce92`
Note: I had to set the variable 'browser' manually in this script as MobaXterm adds '/drives' in front of all paths.
A Chrome window opens, but with the message "This site can't be reached. 127.0.0.1 refused to connect."
And in the MobaXterm prompt I get:
> [5384:8604:0626/191730.302:ERROR:cache_util_win.cc(21)] Unable to move the cache: Zugriff verweigert (0x5)
> [5384:8604:0626/191730.303:ERROR:cache_util.cc(139)] Unable to move cache folder C:\Users\areintges\AppData\Local\Google\Chrome\User Data\ShaderCache\GPUCache to C:\Users\areintges\AppData\Local\Google\Chrome\User Data\ShaderCache\old_GPUCache_000
> [5384:8604:0626/191730.303:ERROR:disk_cache.cc(184)] Unable to create cache
> [5384:8604:0626/191730.303:ERROR:shader_disk_cache.cc(606)] Shader Cache Creation failed: -2
> Opening in an existing browser session.
3) Without script in Anaconda
`ssh -f -D localhost:54321 smomw247@nesh-fe.rz.uni-kiel.de sleep 15`
I enter my password, then the session hangs.
4) With the script for Windows on Anaconda and Windows PowerShell:
`bash run_chromium_through_ssh_tunnel_WINDOWS.sh smomw247@nesh-fe.rz.uni-kiel.de http://127.0.0.1:8889/?token=29ed9b500bfb49264316558faa963416586781409b70ce92`
I get this:
> will route traffic through smomw247@nesh-fe.rz.uni-kiel.de
> using port 54321
> run_chromium_through_ssh_tunnel_WINDOWS.sh: ssh: command not found
> run_chromium_through_ssh_tunnel_WINDOWS.sh: tr: command not found
> run_chromium_through_ssh_tunnel_WINDOWS.sh: paste: command not found
> Won't use proxy for any of:
> run_chromium_through_ssh_tunnel_WINDOWS.sh: : command not found
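The "command not found" lines suggest the shell started from Anaconda/PowerShell cannot see the Unix tools the script relies on; a quick sanity check (a sketch, not part of the script) before running it:

```shell
# Verify that each external tool the tunnel script needs is on the PATH.
for tool in ssh tr paste; do
    command -v "$tool" >/dev/null || echo "missing: $tool" >&2
done
```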
Can anybody help?
My approach no. 2 seemed most promising to me.

## [Add hints on how to debug failing JupyterLab instances on compute nodes](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/36) (Katharina Höflich, 2020-06-02)

I think this would be helpful. Should we prepare a README section on this?

## [Adapt HLRN Jobscript for Berlin?](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/33) (Willi Rath, 2020-06-03)

From <https://git.geomar.de/python/jupyter_on_HPC_setup_guide/issues/32#note_18872>:
> The [job script](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/blob/master/job-scripts/hlrn-goettingen-jupyterlab.sh) for Goettingen also works in Berlin. No changes needed, except for queue names in the header.
Should we provide a separate job script for Berlin, even if it only needs small adaptations?

## [OK to publish this?](https://git.geomar.de/python/xorca_box/-/issues/19) (Willi Rath, 2020-03-02; assigned to Jan Klaus Rieck)

@jan-klaus-rieck Is it OK for you to make this repo public? I'm discussing a software project that aims at NEMO/XGCM data selection with an external collaborator. It would be great to be able to point to your work here.

## [Should we restructure this repository for easier user access to co-existing/complementing solutions?](https://git.geomar.de/python/jupyter_on_HPC_setup_guide/-/issues/30) (Katharina Höflich, 2019-12-11)
For easier navigation of the repository, it could help to do a functional grouping of the scripts, and I would therefore propose to re-organize the scripts into a folder structure like this:
```
.
└── jupyter_on_HPC_setup_guide/
    ├── job_scripts/
    │   └── ...
    ├── remote_Jupyter_manager/
    │   └── ...
    └── SSH_tunneling/
        └── ...
```
Another potential issue might be the README. While "proofreading" for merge request !14, I came to think that it has become rather extensive by now, and I guess that, especially for newcomers, it might be rather confusing/difficult to quickly access the desired information. Should we do something about this?
@willi-rath @sebastian-wahl

## [needs xarray version >= v0.11](https://git.geomar.de/python/xorca_brokenline/-/issues/23) (Klaus Getzlaff, 2019-05-09)
upgrading with `conda install xarray` solves the problemrunning xarray_< v0.11 shows error during execution of `example_02.ipyn` calling function `select_section` while `section = section.assign({'ii': ii[:-1]})`
upgrading with `conda install xarray` solves the problemhttps://git.geomar.de/python/xorca_brokenline/-/issues/22Include high-level function2019-05-02T13:21:53ZPatrick WagnerInclude high-level functionInclude a high level function that takes coordinates and input data as input and returns a section dataset.Include a high level function that takes coordinates and input data as input and returns a section dataset.https://git.geomar.de/python/xorca_box/-/issues/13getbox defines mask on T grid for all variables2019-04-09T09:35:47ZJan Klaus Rieckgetbox defines mask on T grid for all variablesThe question is whether, we want to have accurate extraction of the variables on their own grid and then have differently sized dimensions in the output or we want to have all variables with the same sized dimensions and thereby construc...The question is whether, we want to have accurate extraction of the variables on their own grid and then have differently sized dimensions in the output or we want to have all variables with the same sized dimensions and thereby construct the mask on the T grid and use these indices for all variables....https://git.geomar.de/python/xorca_box/-/issues/12avebox.boxave: spurious dimension z_c added when var has dimension z_l2019-04-12T12:30:11ZJan Klaus Rieckavebox.boxave: spurious dimension z_c added when var has dimension z_lJan Klaus RieckJan Klaus Rieckhttps://git.geomar.de/python/xorca_mockup_nemo_data/-/issues/3add mesh_hgr, mesh_zgr, and mask datasets2019-04-05T16:44:41ZWilli Rathadd mesh_hgr, mesh_zgr, and mask datasetsThese are a litte more tricky, because we need to respect the real topo.
How to generate this:
- random depth field → tmask
- real lat and lon fields → horizontal grid constants
- real vertical grid → vertical grid constants
## [Add fields](https://git.geomar.de/python/xorca_mockup_nemo_data/-/issues/2) (Willi Rath, 2019-04-05)

We need
- [ ] icemod fields.
- [ ] tracer fields and bio stuff