Time Tiling
To visualize the temporal dimension correctly, we need a way to select and cache time ranges of sources.
Currently, time works like this: each source defines a timerange and a list of timepoints at which it can be sampled.
```
Start time                       End Time
|       |       |       |       |       |
1       2       3       4       5       6
9:00    10:00   11:00   12:00   13:00   14:00
```
When a range is selected in the client, say 9:30 to 11:30, this list of timepoints is filtered for each active source so that only the timepoints which lie in the current range are active (in this case 10:00 and 11:00). Their indices are looked up (2 and 3) and used in the `tpage` field of the request.
For tile layers, only the page closest to the center of the timerange is selected.
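A rough sketch of this filtering, assuming hypothetical `Source`/field names and times as i64 milliseconds since the epoch (the real types in the codebase may look different):

```rust
/// Hypothetical source definition: a timerange plus the timepoints at which
/// the source can be sampled (all times in milliseconds since the Unix epoch).
struct Source {
    start: i64,
    end: i64,
    timepoints: Vec<i64>,
}

impl Source {
    /// Indices of all timepoints that lie inside the selected range; these are
    /// the values that end up in the `tpage` field of requests.
    /// (0-based here; the example above labels timepoints starting at 1.)
    fn active_pages(&self, sel_start: i64, sel_end: i64) -> Vec<usize> {
        self.timepoints
            .iter()
            .enumerate()
            .filter(|&(_, &t)| t >= sel_start && t <= sel_end)
            .map(|(i, _)| i)
            .collect()
    }

    /// For tile layers, only the page closest to the center of the selection is used.
    fn center_page(&self, sel_start: i64, sel_end: i64) -> Option<usize> {
        let center = sel_start + (sel_end - sel_start) / 2;
        self.active_pages(sel_start, sel_end)
            .into_iter()
            .min_by_key(|&i| (self.timepoints[i] - center).abs())
    }
}
```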
The way timepoints are distributed varies: for tile layers, each timepoint corresponds to an entry in the file (e.g. if a NetCDF file is loaded, each entry in the time dimension gets a timepoint).
For point layers, it would not be useful to emit a timepoint for each point in the layer. Instead, points are grouped into sets of up to around 50000 points (`RECOMMENDED_PAGE_SIZE`). This is why the request field is called `tpage`: it implements pagination by time.
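A sketch of that grouping (the `Point` type is hypothetical; only `RECOMMENDED_PAGE_SIZE` and the idea of pagination by time come from the description above):

```rust
const RECOMMENDED_PAGE_SIZE: usize = 50_000;

/// Hypothetical point record; only the sample time matters here.
#[derive(Clone)]
struct Point {
    time: i64,
    // ... position, measured value, ...
}

/// Sort the points by time and cut them into pages of at most
/// RECOMMENDED_PAGE_SIZE points. Each page gets one timepoint (here: the time
/// of its first point); the page index is what the client sends as `tpage`.
fn build_pages(mut points: Vec<Point>) -> Vec<(i64, Vec<Point>)> {
    points.sort_by_key(|p| p.time);
    points
        .chunks(RECOMMENDED_PAGE_SIZE)
        .map(|page| (page[0].time, page.to_vec()))
        .collect()
}
```

Note that this sketch assumes all points fit into memory at once, which, as noted below, is not always the case.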
This approach has several problems:
- It arguably returns the wrong result for tile layers: if you select the whole timerange, you would expect to see an average over that range instead of just the timepoint closest to its center.
- Deciding which points to group together takes a lot of time, especially for point layers which cannot be completely loaded into memory (e.g. SOCAT as an SQLite layer).
- For layers with a lot of timepoints, the source definition gets really large (several megabytes for WMS layers).
- While we have a level-of-detail system for spatial definitions, we cannot preload a coarser "time tile" in cases where that would be useful, e.g. for animations.
This means we need a new approach.
One option would be to always use exact time ranges in requests. If each request carried its exact temporal extent, sources could "just" deliver the correct points or tiles for that range. But it would make caching relatively hard: if the extent differed by even a second, a completely new request would be needed, even if it largely overlapped with a previous one.
This problem is well known from systems like WMS, where users can request arbitrary regions and caching is therefore basically impossible. Not being able to cache means a large increase in the required compute power and in response time.
A better solution is to divide the temporal dimension into a binary tree.
One possible way to implement this tree would be to simply subdivide the layer's extent.
Consider our example from above:
```
Start time                       End Time
|       |       |       |       |       |
1       2       3       4       5       6
9:00    10:00   11:00   12:00   13:00   14:00
|<------------------A------------------>|
|<--------B-------->|<--------C-------->|
|<---D--->|<---E--->|<---F--->|<---G--->|
|<-H>|<I->|<J->|<K->|<L->|<M->|<N->|<O->|
```
This way, we would need four levels of time division to represent our layer at the required accuracy.
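Each such "time tile" could be identified by its level in the tree and its index within that level, from which its time range follows directly. A minimal sketch, assuming the tree is rooted at the layer's extent (all names hypothetical):

```rust
/// A node ("time tile") in the binary time tree, identified by its depth and
/// its position within that depth.
#[derive(Clone, Copy, Debug)]
struct TimeTile {
    level: u32, // 0 = whole extent (A), 1 = B/C, 2 = D..G, 3 = H..O
    index: u64, // position within the level, counted from the start time
}

impl TimeTile {
    /// The time range covered by this tile, given the layer extent in
    /// milliseconds since the epoch. (Integer division; a real implementation
    /// would have to decide how to handle the remainder.)
    fn range(&self, extent_start: i64, extent_end: i64) -> (i64, i64) {
        let tiles_on_level = 1i64 << self.level;
        let width = (extent_end - extent_start) / tiles_on_level;
        let start = extent_start + self.index as i64 * width;
        (start, start + width)
    }
}
```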
If the whole timerange were to be requested, the client would request timerange A.
If a selection were to be made from 9:00 to just before 13:00, the client would request ranges B and F.
Had a previous selection already used range B, it could simply be loaded from the cache.
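Deriving the set of tiles to request could be a simple recursive descent: subtrees containing no active timepoint are skipped, tiles lying completely inside the selection (or already at the finest level) are requested, and partially covered tiles are split further. With these rules, the selection from 9:00 to just before 13:00 resolves to exactly B and F. A sketch, building on the hypothetical `TimeTile` above:

```rust
/// Collect the coarsest tiles that cover all timepoints inside the selection,
/// descending at most `max_level` levels (3 in the example above).
fn covering_tiles(
    extent: (i64, i64),
    selection: (i64, i64),
    timepoints: &[i64],
    max_level: u32,
) -> Vec<TimeTile> {
    let mut out = Vec::new();
    let root = TimeTile { level: 0, index: 0 };
    cover(root, extent, selection, timepoints, max_level, &mut out);
    out
}

fn cover(
    tile: TimeTile,
    extent: (i64, i64),
    selection: (i64, i64),
    timepoints: &[i64],
    max_level: u32,
    out: &mut Vec<TimeTile>,
) {
    let (start, end) = tile.range(extent.0, extent.1);
    // Prune subtrees that contain no active timepoint, i.e. no timepoint that
    // lies both inside this tile (half-open interval) and inside the selection.
    let has_active = timepoints
        .iter()
        .any(|&t| t >= start && t < end && t >= selection.0 && t <= selection.1);
    if !has_active {
        return;
    }
    // A tile completely inside the selection (or one that cannot be split any
    // further) is requested as a whole; otherwise descend into its children.
    if (selection.0 <= start && end <= selection.1) || tile.level == max_level {
        out.push(tile);
    } else {
        for offset in 0..2u64 {
            let child = TimeTile {
                level: tile.level + 1,
                index: 2 * tile.index + offset,
            };
            cover(child, extent, selection, timepoints, max_level, out);
        }
    }
}
```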
The request infrastructure could also use this tiling mechanism to build weighted averages for use with tile sources, i.e. use the cached average from B and combine it with a newly computed average from F.
For point sources, it would just add up the points of the requested tiles.
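The averaging itself then becomes a cheap combination step. A sketch for a single value (per pixel it would work the same way); `TileAggregate` and its weight, e.g. the covered duration or the number of underlying samples, are assumptions:

```rust
/// Cached aggregate for one time tile of a tile source.
struct TileAggregate {
    mean: f64,   // mean value over the tile's timerange
    weight: f64, // e.g. covered duration or number of underlying samples
}

/// Weighted average over several cached (or freshly computed) time tiles,
/// e.g. the cached aggregate of B combined with a new aggregate for F.
fn combined_mean(parts: &[TileAggregate]) -> Option<f64> {
    let total_weight: f64 = parts.iter().map(|p| p.weight).sum();
    if total_weight == 0.0 {
        return None;
    }
    let weighted_sum: f64 = parts.iter().map(|p| p.mean * p.weight).sum();
    Some(weighted_sum / total_weight)
}
```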
This mechanism would also work well together with a spatial level-of-detail system.
One thing that would need to be investigated is the timerange over which the tiles are built:
- should each source have its own time tiles,
- should all sources in one running instance share the same time tiles (needs #203 (closed)), or
- should time tiles be built over the generally representable time (i64 milliseconds, with 0 being 1970-01-01T00:00:00.000Z)?
Further research required.