Gap problem for date matching? #85

Open
rod-glover opened this issue Jan 5, 2022 · 0 comments

I have the following TODO in the date matching function in src/utils/portals-common/portals-common.js:

// TODO: Coalesce adjacent histories from a station. NB: tricky.
//  If we don't do this, and we use strict date matching, then a station with
//  several histories fully covering an interval will not be selected, even
//  though it should be. The question of what "adjacent" means is a bit tricky
//  ... would depend in part on history.freq to distinguish too-large gaps,
//  but we already know that attribute isn't always an accurate reflection of
//  actual observations.

In a PR, the following discussion ensued:

We don't care about gaps. Users of the data necessarily have to assume that there will be gaps... that's just the nature of weather data collection.

Originally posted by @jameshiebert in #83 (comment)

That's interesting. First, let's eliminate what I'm calling "strict" date matching, which is the condition
min_obs_time < uiStartDate < uiEndDate < max_obs_time, plus allowance for nils
meaning complete containment in the observation interval. This is currently not used.

Currently, in this app, legacy PDP matching is used, which is the looser condition
min_obs_time < uiEndDate && uiStartDate < max_obs_time, plus allowance for nils
meaning overlap with the observation interval.

Currently, for both legacy PDP and this app, date matching is done separately for each history in a station. If this isn't right, it can be adjusted in this app.
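The two per-history rules above can be sketched as follows. This is a minimal illustration, not the code in portals-common.js: the function names are hypothetical, and the "allowance for nils" is assumed here to mean that a nil bound imposes no constraint.

```javascript
// Hypothetical helpers; names are illustrative, not taken from portals-common.js.
// Assumption: a nil (null/undefined) value on either side imposes no constraint.
const isNil = (x) => x === null || x === undefined;

// "Strict" matching: min_obs_time < uiStartDate < uiEndDate < max_obs_time.
// Assumes the caller already guarantees uiStartDate < uiEndDate.
function strictDateMatch(history, uiStartDate, uiEndDate) {
  const { min_obs_time: minObs, max_obs_time: maxObs } = history;
  return (
    (isNil(minObs) || isNil(uiStartDate) || minObs < uiStartDate) &&
    (isNil(maxObs) || isNil(uiEndDate) || uiEndDate < maxObs)
  );
}

// Legacy PDP matching: min_obs_time < uiEndDate && uiStartDate < max_obs_time.
function overlapDateMatch(history, uiStartDate, uiEndDate) {
  const { min_obs_time: minObs, max_obs_time: maxObs } = history;
  return (
    (isNil(minObs) || isNil(uiEndDate) || minObs < uiEndDate) &&
    (isNil(maxObs) || isNil(uiStartDate) || uiStartDate < maxObs)
  );
}

// Example: a selection that starts inside the history and ends after it.
const exampleHistory = {
  min_obs_time: new Date("2000-01-01"),
  max_obs_time: new Date("2010-01-01"),
};
const exampleStart = new Date("2005-01-01");
const exampleEnd = new Date("2015-01-01");
console.log(overlapDateMatch(exampleHistory, exampleStart, exampleEnd)); // overlap holds
console.log(strictDateMatch(exampleHistory, exampleStart, exampleEnd)); // containment fails
```

Both functions are applied to one history at a time, which is the behaviour described above for legacy PDP and this app.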

Originally posted by @rod-glover in #83 (comment)

What I meant about gaps is this: Consider a station with multiple, say 2, histories spanning dates A to B (hx1) and C to D (hx2).

Suppose the user specifies start and end dates s, e such that A < s < B < C < e < D. The legacy matching rule will operate as follows:

  • hx1: A < e && s < B === true: match
  • hx2: C < e && s < D === true: match

That's as desired.

But when we have, say, A < B < s < e < C < D then:

  • hx1: A < e && s < B === false: no match
  • hx2: C < e && s < D === false: no match

When both the start and end dates fall into the gap between the histories' intervals, matching fails for both histories. Even if only one of them falls in the gap, one of the two histories will not match.

If we decide that the two histories really form one contiguous period, then this matching rule is incorrect. It will bite us if there are large gaps (C - B) into which a start or end date could easily fall. In that case, we'd want a test that used only A to D as the interval, and matched both histories on that basis. That's not hard, but it's different from what we have now.
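A station-level test of that kind might look like the sketch below. The function names are hypothetical, and it assumes histories with no observations (null min_obs_time or max_obs_time) are simply left out of the station's span.

```javascript
// Per-history legacy rule, for comparison (nil handling omitted for brevity).
function historyOverlaps(history, s, e) {
  return history.min_obs_time < e && s < history.max_obs_time;
}

// Hypothetical: the overall span A..D of a station's histories.
// Assumption: histories without observations are ignored.
function stationSpan(histories) {
  const observed = histories.filter(
    (h) => h.min_obs_time != null && h.max_obs_time != null
  );
  if (observed.length === 0) return null;
  return {
    // Math.min/Math.max coerce Date objects to millisecond timestamps.
    min_obs_time: new Date(Math.min(...observed.map((h) => h.min_obs_time))),
    max_obs_time: new Date(Math.max(...observed.map((h) => h.max_obs_time))),
  };
}

// Match the whole station (and hence all its histories) on the A..D span.
function stationOverlaps(histories, s, e) {
  const span = stationSpan(histories);
  return span !== null && historyOverlaps(span, s, e);
}

// The A < B < s < e < C < D case from above.
const gapStation = [
  { min_obs_time: new Date("1990-01-01"), max_obs_time: new Date("2000-01-01") }, // hx1: A..B
  { min_obs_time: new Date("2010-01-01"), max_obs_time: new Date("2020-01-01") }, // hx2: C..D
];
const gapStart = new Date("2003-01-01");
const gapEnd = new Date("2006-01-01");
console.log(gapStation.map((h) => historyOverlaps(h, gapStart, gapEnd))); // neither history matches
console.log(stationOverlaps(gapStation, gapStart, gapEnd)); // the station span matches
```

Note that this version treats any gap, however large, as part of one contiguous period.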

Also, it gets more complicated if we decide that a "large" gap (C - B) means something different than a small gap.

Also, maybe there's a complication with multiple histories that overlap: A < C < B < D or the like.
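If gap size does matter, one possible approach is a standard interval merge with a gap tolerance, sketched below. Everything here is an assumption for illustration: the function name, the `maxGapMs` threshold (this discussion does not settle what counts as a "large" gap), and the choice to skip histories with no observations. Overlapping histories (A < C < B < D) fall out naturally, because their gap is negative.

```javascript
// Hypothetical sketch: coalesce a station's histories into merged intervals,
// joining neighbours whose gap is at most maxGapMs (an assumed threshold).
function coalesceHistories(histories, maxGapMs) {
  const sorted = histories
    // Assumption: histories without observations are skipped.
    .filter((h) => h.min_obs_time != null && h.max_obs_time != null)
    .map((h) => ({
      start: h.min_obs_time.getTime(),
      end: h.max_obs_time.getTime(),
    }))
    .sort((a, b) => a.start - b.start);

  const merged = [];
  for (const interval of sorted) {
    const last = merged[merged.length - 1];
    if (last && interval.start - last.end <= maxGapMs) {
      // Adjacent or overlapping: extend the current merged interval.
      last.end = Math.max(last.end, interval.end);
    } else {
      merged.push({ ...interval });
    }
  }
  return merged.map((m) => ({
    min_obs_time: new Date(m.start),
    max_obs_time: new Date(m.end),
  }));
}

// Two histories separated by a 31-day gap.
const DAY_MS = 24 * 60 * 60 * 1000;
const nearbyHistories = [
  { min_obs_time: new Date("1990-01-01"), max_obs_time: new Date("2000-01-01") },
  { min_obs_time: new Date("2000-02-01"), max_obs_time: new Date("2010-01-01") },
];
console.log(coalesceHistories(nearbyHistories, 60 * DAY_MS).length); // merged into one interval
console.log(coalesceHistories(nearbyHistories, 1 * DAY_MS).length); // kept as two intervals
```

Date matching could then be run against each merged interval instead of each history; a `maxGapMs` of `Infinity` reduces this to the single A to D span.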

Questions:

  1. Does the gap problem matter?
  2. Legacy PDP effectively defines "station" as "history" and ignores meta_station records. SDP treats history records linked by a common station as related.

Originally posted by @rod-glover in #83 (comment)

One further note: Because they are drawn from station_obs_stats_mv, which is updated directly from observations, min_obs_time and max_obs_time are never null (oops: except if there are no observations), and therefore the checking for nulls in the matching rules can (maybe) be simplified accordingly.

But they are not like edate, for which null means "ongoing". Ongoing or not, these values are non-null except when there are no observations at all for that history.

Originally posted by @rod-glover in #83 (comment)
