data_repo_renderer issues
https://git.geomar.de/data/tools/data_repo_renderer/-/issues
2019-01-14T13:25:27Z

Adjust link to repo_renderer
https://git.geomar.de/data/tools/data_repo_renderer/-/issues/74
2019-01-14T13:25:27Z, Gabriel Ditzinger

Clone should use https://git.geomar.de/data/tools/data_repo_renderer.git

Add renderer for submodules
https://git.geomar.de/data/tools/data_repo_renderer/-/issues/44
2017-08-14T14:52:38Z, Willi Rath

Currently, `git submodule` is used in MIMOC and MIMOC_cf via the pre-processing introduced in #12. This is not very nice and does not allow for easily adding submodules to depend on.

Add regression tests
https://git.geomar.de/data/tools/data_repo_renderer/-/issues/2
2017-07-13T09:51:10Z, Willi Rath

Currently (a9c4c2c91cdb6a8ea4344b7645a545b02a7a9203), the tests do not cover [missing arguments to the `Renderer` subclasses](https://git.geomar.de/data/data_repo_renderer/blob/a9c4c2c91cdb6a8ea4344b7645a545b02a7a9203/data_repo_renderer/__init__.py#L35) and the [case of a missing version string](https://git.geomar.de/data/data_repo_renderer/blob/a9c4c2c91cdb6a8ea4344b7645a545b02a7a9203/data_repo_renderer/__init__.py#L11).
I don't see a way to cover the Exception for running the renderer without installation.

Compress data without sacrificing benefits from mirroring
https://git.geomar.de/data/tools/data_repo_renderer/-/issues/24
2017-07-27T19:07:20Z, Willi Rath

Most upstream data repositories either use uncompressed legacy formats or explicitly gzip their data files. The former gives away about a factor of 3 in file size. The latter results in reasonable disk use but adds decompression overhead every time the data is used.
An approach that (at first glance) helps would be to download the data and then convert it to reasonably sized `netCDF4-classic` files with deflation. This, however, breaks efficient use of, e.g., `wget`'s mirroring capabilities, which rely on comparing upstream files to those already present on disk.
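To illustrate why in-place conversion defeats mirroring, here is a minimal Python sketch of a size-and-timestamp check in the spirit of `wget`'s time-stamping mode. The function names `needs_download` and `demo` and the file name `data.nc` are hypothetical, and the comparison rule is a simplification, not `wget`'s actual code.

```python
import os
import tempfile
from pathlib import Path


def needs_download(local: Path, remote_size: int, remote_mtime: float) -> bool:
    """Simplified mirroring check: fetch again if the local copy is missing,
    differs in size from the upstream file, or is older than it."""
    if not local.exists():
        return True
    stat = local.stat()
    return stat.st_size != remote_size or stat.st_mtime < remote_mtime


def demo() -> tuple:
    """Show that replacing the mirrored file by a smaller (converted) one
    makes the check request a fresh download."""
    remote_size, remote_mtime = 300, 1_500_000_000.0  # made-up upstream metadata
    with tempfile.TemporaryDirectory() as tmp:
        local = Path(tmp) / "data.nc"
        # Untouched mirror copy: same size and timestamp as upstream.
        local.write_bytes(b"\0" * remote_size)
        os.utime(local, (remote_mtime, remote_mtime))
        before = needs_download(local, remote_size, remote_mtime)
        # Stand-in for a deflated conversion: the file shrinks by a factor of 3.
        local.write_bytes(b"\0" * 100)
        after = needs_download(local, remote_size, remote_mtime)
    return before, after
```

With the untouched copy the check reports nothing to do; after the stand-in conversion the size mismatch alone triggers a re-download, so every mirror run would fetch the whole file again.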
Another thing to keep in mind: Compression with, e.g., `ncks` (see [TM/TMSoftware/convert_to_deflated_nc4classic_with_small_chunks.sh](https://git.geomar.de/TM/TMSoftware/blob/master/convert_to_deflated_nc4classic_with_small_chunks.sh)) breaks Git hashing by adding a `history` attribute containing date info, so two conversions of the same data never produce identical files. Simply keeping one copy for the mirror, then compressing, and then tracking versions won't work without cleaning the netCDF files. (We need content-based hashing ...)
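One possible shape of such content-based hashing, as a rough Python sketch: hash the variable data and the attributes, but skip attributes that change on every conversion. The function `content_hash`, the constant `VOLATILE_ATTRIBUTES`, and the plain-dict inputs are assumptions for illustration; reading the variables and attributes out of an actual netCDF file is not shown here.

```python
import hashlib

# Attributes assumed to change on every conversion run (hypothetical list).
VOLATILE_ATTRIBUTES = frozenset({"history"})


def content_hash(variables, attributes, ignore=VOLATILE_ATTRIBUTES):
    """Return a SHA-256 hex digest over variable data and stable attributes.

    variables:  mapping of variable name -> raw data as bytes
    attributes: mapping of attribute name -> string value
    ignore:     attribute names left out of the hash
    """
    digest = hashlib.sha256()
    # Sort names so the digest does not depend on insertion order.
    for name in sorted(variables):
        data = variables[name]
        # Length prefix keeps adjacent fields from running into each other.
        digest.update(f"var:{name}:{len(data)}:".encode("utf-8"))
        digest.update(data)
    for name in sorted(attributes):
        if name in ignore:
            continue
        value = attributes[name].encode("utf-8")
        digest.update(f"att:{name}:{len(value)}:".encode("utf-8"))
        digest.update(value)
    return digest.hexdigest()


# Two conversions of the same data that differ only in the history timestamp.
run_a = content_hash({"temp": b"\x01\x02"}, {"units": "degC", "history": "run on day 1"})
run_b = content_hash({"temp": b"\x01\x02"}, {"units": "degC", "history": "run on day 2"})
print(run_a == run_b)
```

Under these assumptions the two runs hash identically, while any change to the data or to a non-ignored attribute changes the digest.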