I have the following TODO in the date matching function in src/utils/portals-common/portals-common.js:
// TODO: Coalesce adjacent histories from a station. NB: tricky.
// If we don't do this, and we use strict date matching, then a station with
// several histories fully covering an interval will not be selected, even
// though it should be. The question of what "adjacent" means is a bit tricky
// ... would depend in part on history.freq to distinguish too-large gaps,
// but we already know that attribute isn't always an accurate reflection of
// actual observations.
In a PR, the following discussion ensued:
We don't care about gaps. Users of the data necessarily have to assume that there will be gaps... that's just the nature of weather data collection.
That's interesting. First, let's eliminate what I'm calling "strict" date matching, which is the condition min_obs_time < uiStartDate < uiEndDate < max_obs_time, plus allowance for nils, meaning complete containment in the observation interval. This is currently not used.
Currently, in this app, legacy PDP matching is used, which is the looser condition min_obs_time < uiEndDate && uiStartDate < max_obs_time, plus allowance for nils, meaning overlap with the observation interval.
Currently, for both legacy PDP and this app, date matching is done separately for each history in a station. If this isn't right, it can be adjusted in this app.
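The two conditions above can be sketched as predicates over a single history. This is a minimal sketch, not the code in portals-common.js: the function names `strictDateMatch` and `overlapDateMatch` and the helper `ltOrNil` are hypothetical, and the "allowance for nils" is assumed here to mean that any comparison with a nil operand passes.

```javascript
// Hypothetical helpers. ASSUMPTION: a nil (null/undefined) operand means
// "no constraint", so any comparison involving it passes.
const isNil = (v) => v === null || v === undefined;
const ltOrNil = (a, b) => isNil(a) || isNil(b) || a < b;

// "Strict" matching: the UI interval is completely contained in the
// history's observation interval.
function strictDateMatch(history, uiStartDate, uiEndDate) {
  return (
    ltOrNil(history.min_obs_time, uiStartDate) &&
    ltOrNil(uiStartDate, uiEndDate) &&
    ltOrNil(uiEndDate, history.max_obs_time)
  );
}

// Legacy PDP matching: the UI interval overlaps the observation interval.
function overlapDateMatch(history, uiStartDate, uiEndDate) {
  return (
    ltOrNil(history.min_obs_time, uiEndDate) &&
    ltOrNil(uiStartDate, history.max_obs_time)
  );
}

// Illustrative data only.
const history = {
  min_obs_time: new Date("2000-01-01"),
  max_obs_time: new Date("2010-01-01"),
};
const start = new Date("2005-01-01");
const end = new Date("2015-01-01");

console.log(strictDateMatch(history, start, end)); // false: end is after max_obs_time
console.log(overlapDateMatch(history, start, end)); // true: the intervals overlap
```

Both predicates are applied to one history at a time, which is the per-history behaviour described above.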
What I meant about gaps is this: Consider a station with multiple, say 2, histories spanning dates A to B (hx1) and C to D (hx2).
Suppose the user specifies start and end dates s, e such that A < s < B < C < e < D. The legacy matching rule will operate as follows:
hx1: A < e && s < B === true: match
hx2: C < e && s < D === true: match
That's as desired.
But when we have, say, A < B < s < e < C < D then:
hx1: A < e && s < B === false: no match
hx2: C < e && s < D === false: no match
When both the start and end dates fall into the gap between the histories' intervals, neither history matches. Even one of them in the gap means that one of the two histories will not match.
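To make the cases concrete, here is a small sketch that applies the legacy overlap rule per history. The years are made-up stand-ins for A, B, C, D (plain numbers rather than dates, for brevity), and `matchesHistory` is a hypothetical name; nil handling is left out.

```javascript
// Legacy overlap rule for one history (nil handling omitted in this sketch).
function matchesHistory(hx, s, e) {
  return hx.min_obs_time < e && s < hx.max_obs_time;
}

// Illustrative values: A = 1950, B = 1960, C = 1980, D = 1990.
const hx1 = { min_obs_time: 1950, max_obs_time: 1960 };
const hx2 = { min_obs_time: 1980, max_obs_time: 1990 };

// A < s < B < C < e < D: both histories match.
console.log(matchesHistory(hx1, 1955, 1985), matchesHistory(hx2, 1955, 1985)); // true true

// A < B < s < e < C < D: both dates in the gap, neither history matches.
console.log(matchesHistory(hx1, 1965, 1975), matchesHistory(hx2, 1965, 1975)); // false false

// A < B < s < C < e < D: only the start date in the gap, hx1 drops out.
console.log(matchesHistory(hx1, 1965, 1985), matchesHistory(hx2, 1965, 1985)); // false true
```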
If we decide that the two histories really form one contiguous period, then this matching rule is incorrect. It will bite us if there are large gaps (C - B) into which a start or end date could easily fall. In that case, we'd want a test that used only A to D as the interval, and matched both histories on that basis. That's not hard, but it's different from what we have now.
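A test over the whole A-to-D interval might look like the following sketch. It assumes every history has non-null `min_obs_time` and `max_obs_time` (again using numbers as stand-ins for dates); `stationSpan` and `stationDateMatch` are hypothetical names, not existing functions in this app.

```javascript
// Earliest min_obs_time and latest max_obs_time over all of a station's
// histories. ASSUMPTION: both values are non-null for every history passed in.
function stationSpan(histories) {
  return {
    min: Math.min(...histories.map((h) => h.min_obs_time)),
    max: Math.max(...histories.map((h) => h.max_obs_time)),
  };
}

// Overlap test against the station-wide span. If it passes, every history
// of the station is treated as matching.
function stationDateMatch(histories, s, e) {
  if (histories.length === 0) return false;
  const { min, max } = stationSpan(histories);
  return min < e && s < max;
}

// Illustrative values: A = 1950, B = 1960, C = 1980, D = 1990.
const histories = [
  { min_obs_time: 1950, max_obs_time: 1960 },
  { min_obs_time: 1980, max_obs_time: 1990 },
];

console.log(stationDateMatch(histories, 1965, 1975)); // true: the gap no longer matters
console.log(stationDateMatch(histories, 1995, 2000)); // false: entirely after D
```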
Also, it gets more complicated if we decide that a "large" gap (C - B) means something different than a small gap.
Also, maybe there's a complication with multiple histories that overlap: A < C < B < D or the like.
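One way to handle both the gap-size question and overlapping histories is to coalesce intervals before matching. The sketch below is only an illustration under assumptions: `maxGap` is a hypothetical threshold (how to choose it is exactly the open question), `coalesceIntervals` and `matchesAny` are hypothetical names, and numbers stand in for dates.

```javascript
// Merge intervals whose gap is at most maxGap. Overlapping intervals have a
// negative gap, so they always merge when maxGap >= 0.
function coalesceIntervals(intervals, maxGap) {
  const sorted = [...intervals].sort((a, b) => a.min - b.min);
  const merged = [];
  for (const { min, max } of sorted) {
    const last = merged[merged.length - 1];
    if (last && min - last.max <= maxGap) {
      // Close enough to the previous interval: extend it.
      last.max = Math.max(last.max, max);
    } else {
      merged.push({ min, max });
    }
  }
  return merged;
}

// Overlap test against any of the coalesced intervals.
function matchesAny(intervals, s, e) {
  return intervals.some(({ min, max }) => min < e && s < max);
}

// Illustrative values: A = 1950, B = 1960, C = 1980, D = 1990, gap C - B = 20.
const separate = [{ min: 1950, max: 1960 }, { min: 1980, max: 1990 }];

// A small threshold keeps the histories separate; a large one merges them.
console.log(coalesceIntervals(separate, 5).length); // 2
console.log(coalesceIntervals(separate, 25).length); // 1

// The overlapping case A < C < B < D merges even with maxGap = 0.
const overlapping = [{ min: 1950, max: 1970 }, { min: 1960, max: 1990 }];
console.log(coalesceIntervals(overlapping, 0).length); // 1

// With the gap bridged, dates inside the gap now match.
console.log(matchesAny(coalesceIntervals(separate, 25), 1965, 1975)); // true
console.log(matchesAny(coalesceIntervals(separate, 5), 1965, 1975)); // false
```

The input array is copied before sorting and new interval objects are pushed, so the caller's data is not mutated.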
Questions:
Does the gap problem matter?
Legacy PDP's effective definition of "station" is "history", and effectively ignores meta_station records. SDP treats history records linked by a common station as related.
One further note: Because they are drawn from station_obs_stats_mv, which is updated directly from observations, min_obs_time and max_obs_time are never null (oops: except if there are no observations), and therefore the checking for nulls in the matching rules can (maybe) be simplified accordingly.
But they are not like edate, in which null means "ongoing". Ongoing or not, these values are non-null except when there are no observations at all for that history.
In the discussion quoted above, the first comment (on not caring about gaps) was originally posted by @jameshiebert in #83 (comment); the subsequent comments were originally posted by @rod-glover in #83 (comment).