-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add "localize" and "date_range_for_midnight_range" fns #178
Conversation
This file was previously an inconsistent mix of classes targetting whole classes (of methods) vs a single method of a class. Nesting test classes for specific methods under test is simplest - avoids the need to name test classes to represent class + method together.
62c8ea1
to
c628150
Compare
@@ -318,6 +318,34 @@ def test_range_difference_and_intersection_form_partition( | |||
# Ignore types here as structuring this to appease mypy would make it v ugly. | |||
assert (a_difference | intersection | b_difference) == (a | b) # type: ignore[operator] | |||
|
|||
class TestCopy: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧀 Moved up / renamed from standalone test class below 👇.
ValueError: | ||
If one or both boundaries are naive (no timezone). | ||
""" | ||
if not self.start.tzinfo or not self.end.tzinfo: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧀 astimezone
will assume the system timezone by default 1, which may or may not be correct - best to error, which is consistent with the behaviour for localtime
/ Django.
Footnotes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is looking ok - a few questions left about timezones in general - will be happy to approve once comments are addressed.
@@ -964,6 +965,34 @@ def test_union_of_overlapping_ranges(self): | |||
end=datetime.datetime(2000, 1, 4), | |||
) | |||
|
|||
class TestLocalize: | |||
def test_converts_timezone(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please add a test here over DST boundaries? Specifically what happens when you have an hour range (in utc) and the start localises to a different tz-offset, and a separate test when end localises to a different tz-offset.
I'm not confident astimezone
will do the right thing - whether that correctly adjusts the offset or not. We dealt with a similar issue of DateTimeRangeField (that would automatically localise) and it would result in empty ranges, because the post-localised value included a fold=1
attribute indicating which side of the boundary it should be on, but the start and end values themselves had the same time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good spot! I've added a couple of tests to try and illustrate this (WIP commit). My assumption is that we should just error here, as the period of time just "doesn't exist" in the target timezone.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-blocking/discussion
Yeah I don't think there's anything else we can really do without getting pretty deep into datetime/tz handling logic. Happy with the exception, nothing else to do here.
Further elaboration non-blocking
Localising a datetime, natively, does not work well with ranges, because much of the standard library does not account for canonicalisation/folding/offsets, and we're relying on the standard library here.
We could try and re-interpret any fold
attribute to canonicalise the datetime, but I don't think a range module should be trying to do that.
But this is kind of why I'm uncertain about building in localization
into this module in the first place. Perhaps it really is better to leave that to outside callers, so they can handle that complexity? Again I'm just musing, and am not requesting changes, but this is where my thought process is on adding functionality that isn't inherent to the concept of range.
We already have footguns though.. simply round-tripping a non-canonical datetime from local_tz to utc back to local_tz can produce errors:
In [61]: SYD = ZoneInfo("Australia/Sydney")
In [62]: dst_forward = datetime(2024, 10, 6, 2, tzinfo=SYD)
In [63]: (localtime.as_localtime(localtime.as_localtime(dst_forward, tz=utc), tz=SYD)-dst_forward).total_seconds()
Out[63]: 3600.0
^ that should be 0 seconds - but the timedelta module doesn't consider offsets important 🙃 and even our seconds
and days
attributes on FiniteDatetimeRange are going to be wrong with non-utc based start and ends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very interesting stuff!
Localising a datetime, natively, does not work well with ranges
In what way?
Some periods of time just "don't exist" in other timezones - the errors reflect that.
that should be 0 seconds - but the timedelta module doesn't consider offsets important
I'm not sure this example is realistic: 2AM doesn't actually exist at that point in time.
Arguably datetime
itself should error on creation if the point in time doesn't exist:
...instances that fall in the spring-forward gap. Such instances don’t represent any valid time
But this isn't just Garbage In, Garbage Out, as timedelta
does emit invalid datetimes:
datetime(2024, 10, 6, 2, tzinfo=SYD) + timedelta(hours=1)
=> datetime.datetime(2024, 10, 6, 2, 0, tzinfo=...) # <== WRONG
Overall this is a problem with datetime
itself. If it weren't a problem, I don't think we
would be having this discussion, or any qualms about this function.
🐮 For now I've done a PR to document this pitfall: #180
🐮 Maybe we should wrap datetime
so it actually behaves correctly 🐉 💥.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I've fixed up the new tests into the second commit, which adds the function under test.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this is a problem with datetime itself.
Yes! It doesn't handle timezones or localisation well at all.
If it weren't a problem, I don't think we would be having this discussion, or any qualms about this function.
Should dict[Any, datetime]
support localize(tz=...)
?
Maybe we should wrap datetime
That's why libraries like arrow and dateutil and others exist, because the std library datetime module is a mess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more thing - the other side of DST where time goes backwards is similarly bad, and isn't an example of garbage in.
In [36]: localtime.as_localtime(datetime(2024, 4, 6, 17, tzinfo=UTC), tz=SYD) - localtime.as_localtime(datetime(2024, 4, 6, 16, tzinfo=UTC), tz=SYD)
Out[36]: datetime.timedelta(seconds=3600)
In [37]: localtime.as_localtime(datetime(2024, 4, 6, 17, tzinfo=UTC), tz=SYD) - localtime.as_localtime(datetime(2024, 4, 6, 15, tzinfo=UTC), tz=SYD)
Out[37]: datetime.timedelta(seconds=3600)
Here, we see two 2 hours of UTC being localised to 1 hour of duration, because resolving it to "SYD" is the 2am after the wind-back, and hence the fold attribute is set:
In [39]: localtime.as_localtime(datetime(2024, 4, 6, 16, tzinfo=UTC), tz=SYD)
Out[39]: datetime.datetime(2024, 4, 7, 2, 0, fold=1, tzinfo=zoneinfo.ZoneInfo(key='Australia/Sydney'))
Timezone names aren't helpful for comparing durations over DST boundaries. It's all.. annoying and hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should dict[Any, datetime] support localize(tz=...)?
Well, no, but I'd say that's more because dict
is generic, where FiniteDatetimeRange
is specific.
the other side of DST where time goes backwards is similarly bad
Hmm...thought provoking. To make the localization explicit:
- 2024-04-06 15:00 UTC => 2024-04-07 02:00 SYD
- 2024-04-06 16:00 UTC => 2024-04-07 02:00 SYD
- 2024-04-06 17:00 UTC => 2024-04-07 03:00 SYD
Here, we see two 2 hours of UTC being localised to 1 hour of duration
Is it wrong though? SYD is losing an hour after all, so that has to have an effect somewhere.
c628150
to
acc415a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more blocking change - referencing localtime
in day_before
- once that's resolved I'm happy to approve.
@@ -964,6 +965,34 @@ def test_union_of_overlapping_ranges(self): | |||
end=datetime.datetime(2000, 1, 4), | |||
) | |||
|
|||
class TestLocalize: | |||
def test_converts_timezone(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-blocking/discussion
Yeah I don't think there's anything else we can really do without getting pretty deep into datetime/tz handling logic. Happy with the exception, nothing else to do here.
Further elaboration non-blocking
Localising a datetime, natively, does not work well with ranges, because much of the standard library does not account for canonicalisation/folding/offsets, and we're relying on the standard library here.
We could try and re-interpret any fold
attribute to canonicalise the datetime, but I don't think a range module should be trying to do that.
But this is kind of why I'm uncertain about building in localization
into this module in the first place. Perhaps it really is better to leave that to outside callers, so they can handle that complexity? Again I'm just musing, and am not requesting changes, but this is where my thought process is on adding functionality that isn't inherent to the concept of range.
We already have footguns though.. simply round-tripping a non-canonical datetime from local_tz to utc back to local_tz can produce errors:
In [61]: SYD = ZoneInfo("Australia/Sydney")
In [62]: dst_forward = datetime(2024, 10, 6, 2, tzinfo=SYD)
In [63]: (localtime.as_localtime(localtime.as_localtime(dst_forward, tz=utc), tz=SYD)-dst_forward).total_seconds()
Out[63]: 3600.0
^ that should be 0 seconds - but the timedelta module doesn't consider offsets important 🙃 and even our seconds
and days
attributes on FiniteDatetimeRange are going to be wrong with non-utc based start and ends.
This works towards DRYing-up converting ranges to date ranges in the next commits - localizing helps ensure a range is aligned to midnight, even if it was stored/retrieved as e.g. UTC.
This will be used to DRY-up converting datetime ranges. Before: ranges.FiniteDateRange( midnight_aligned_range.start.date(), midnight_aligned_range.end.date() - timedelta(days=1), ) After: ranges.date_range_for_midnight_range(midnight_aligned_range) Pull Request in Kraken Core to demo usage [^1]. [^1]: octoenergy/kraken-core#166268
acc415a
to
c840f92
Compare
In response to [^1]. [^1]: #178 (comment)
In response to [^1]. This is a known issue with datetime [^2]. [^1]: #178 (comment) [^2]: https://peps.python.org/pep-0495/#the-fold-attribute
# Create a range in London over the hour that is "gained" | ||
# when Daylight Savings Time (DST) starts - at 1AM. | ||
# | ||
# Note: this is allowed by datetime but not a realistic |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧀 Added comment that this test isn't very realistic.
In response to [^1]. This is a known issue with datetime [^2]. [^1]: #178 (comment) [^2]: https://peps.python.org/pep-0495/#the-fold-attribute
@jarshwah thanks again for all the feedback 😃. I've replied to your last comment, and also pushed a commit to bump the version - sorry, forgot to do that as part of the last feedback cycle. I've put it in this PR, and deferred merging, just in case that outstanding thread changes your mind. |
Good to merge @benthorner - though we usually only need a version bump immediately if there’s a major version shift (to ensure there aren’t releases of other minor changes). There’s not a lot of activity on this package though so it’s fine to bump now. |
In response to [^1]. [^1]: #178 (comment)
In response to [^1]. [^1]: #178 (comment)
In response to [^1]. [^1]: #178 (comment)
In response to [^1]. [^1]: #178 (comment)
In response to [^1]. [^1]: #178 (comment)
In response to [^1]. [^1]: #178 (comment)
Follow-up to 1.
Together these can be used to DRY-up converting datetime ranges to
date ranges, irrespective of how the datetime range has been stored.
Before:
After:
Note: in the example the datetime range should be midnight-aligned
in the target timezone, which may or may not be its current timezone.
Pull Request in Kraken Core to demo usage 2.
Footnotes
https://github.com/octoenergy/kraken-core/pull/166008#discussion_r1842744663 ↩
https://github.com/octoenergy/kraken-core/pull/166268 ↩