Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add "localize" and "date_range_for_midnight_range" fns #178

Merged
merged 4 commits into from
Dec 11, 2024

Conversation

benthorner
Copy link
Contributor

@benthorner benthorner commented Nov 22, 2024

Follow-up to 1.

Together these can be used to DRY-up converting datetime ranges to
date ranges, irrespective of how the datetime range has been stored.

Before:

ranges.FiniteDateRange(
    localtime.date(midnight_aligned_range.start),
    localtime.date_of_day_before(midnight_aligned_range.end),
)

After:

ranges.date_range_for_midnight_range(
    midnight_aligned_range.localize(timezone.get_current_timezone()),
)

Note: in the example the datetime range should be midnight-aligned
in the target timezone, which may or may not be its current timezone.

Pull Request in Kraken Core to demo usage 2.

Footnotes

  1. https://github.com/octoenergy/kraken-core/pull/166008#discussion_r1842744663

  2. https://github.com/octoenergy/kraken-core/pull/166268

@benthorner benthorner marked this pull request as ready for review November 22, 2024 11:25
This file was previously an inconsistent mix of classes targetting
whole classes (of methods) vs a single method of a class. Nesting
test classes for specific methods under test is simplest - avoids
the need to name test classes to represent class + method together.
@benthorner benthorner force-pushed the ranges/dt-to-date-range branch from 62c8ea1 to c628150 Compare December 4, 2024 16:08
@benthorner benthorner changed the title Add ranges "range_for_midnight_range" utility fns Add "localize" and "date_range_for_midnight_range" fns Dec 4, 2024
@@ -318,6 +318,34 @@ def test_range_difference_and_intersection_form_partition(
# Ignore types here as structuring this to appease mypy would make it v ugly.
assert (a_difference | intersection | b_difference) == (a | b) # type: ignore[operator]

class TestCopy:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧀 Moved up / renamed from standalone test class below 👇.

ValueError:
If one or both boundaries are naive (no timezone).
"""
if not self.start.tzinfo or not self.end.tzinfo:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧀 astimezone will assume the system timezone by default 1, which may or may not be correct - best to error, which is consistent with the behaviour for localtime / Django.

Footnotes

  1. https://docs.python.org/3/library/datetime.html#datetime.datetime.astimezone

@benthorner benthorner requested a review from jarshwah December 4, 2024 16:18
Copy link
Contributor

@jarshwah jarshwah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is looking ok - a few questions left about timezones in general - will be happy to approve once comments are addressed.

tests/test_ranges.py Show resolved Hide resolved
tests/test_ranges.py Show resolved Hide resolved
@@ -964,6 +965,34 @@ def test_union_of_overlapping_ranges(self):
end=datetime.datetime(2000, 1, 4),
)

class TestLocalize:
def test_converts_timezone(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a test here over DST boundaries? Specifically what happens when you have an hour range (in utc) and the start localises to a different tz-offset, and a separate test when end localises to a different tz-offset.

I'm not confident astimezone will do the right thing - whether that correctly adjusts the offset or not. We dealt with a similar issue of DateTimeRangeField (that would automatically localise) and it would result in empty ranges, because the post-localised value included a fold=1 attribute indicating which side of the boundary it should be on, but the start and end values themselves had the same time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot! I've added a couple of tests to try and illustrate this (WIP commit). My assumption is that we should just error here, as the period of time just "doesn't exist" in the target timezone.

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking/discussion

Yeah I don't think there's anything else we can really do without getting pretty deep into datetime/tz handling logic. Happy with the exception, nothing else to do here.

Further elaboration non-blocking

Localising a datetime, natively, does not work well with ranges, because much of the standard library does not account for canonicalisation/folding/offsets, and we're relying on the standard library here.

We could try and re-interpret any fold attribute to canonicalise the datetime, but I don't think a range module should be trying to do that.

But this is kind of why I'm uncertain about building in localization into this module in the first place. Perhaps it really is better to leave that to outside callers, so they can handle that complexity? Again I'm just musing, and am not requesting changes, but this is where my thought process is on adding functionality that isn't inherent to the concept of range.

We already have footguns though.. simply round-tripping a non-canonical datetime from local_tz to utc back to local_tz can produce errors:

In [61]: SYD = ZoneInfo("Australia/Sydney")

In [62]: dst_forward = datetime(2024, 10, 6, 2, tzinfo=SYD)

In [63]: (localtime.as_localtime(localtime.as_localtime(dst_forward, tz=utc), tz=SYD)-dst_forward).total_seconds()
Out[63]: 3600.0

^ that should be 0 seconds - but the timedelta module doesn't consider offsets important 🙃 and even our seconds and days attributes on FiniteDatetimeRange are going to be wrong with non-utc based start and ends.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very interesting stuff!

Localising a datetime, natively, does not work well with ranges

In what way?

Some periods of time just "don't exist" in other timezones - the errors reflect that.

that should be 0 seconds - but the timedelta module doesn't consider offsets important

I'm not sure this example is realistic: 2AM doesn't actually exist at that point in time.

Arguably datetime itself should error on creation if the point in time doesn't exist:

...instances that fall in the spring-forward gap. Such instances don’t represent any valid time

But this isn't just Garbage In, Garbage Out, as timedelta does emit invalid datetimes:

datetime(2024, 10, 6, 2, tzinfo=SYD) + timedelta(hours=1)
=> datetime.datetime(2024, 10, 6, 2, 0, tzinfo=...)  # <== WRONG

Overall this is a problem with datetime itself. If it weren't a problem, I don't think we
would be having this discussion, or any qualms about this function.

🐮 For now I've done a PR to document this pitfall: #180

🐮 Maybe we should wrap datetime so it actually behaves correctly 🐉 💥.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I've fixed up the new tests into the second commit, which adds the function under test.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is a problem with datetime itself.

Yes! It doesn't handle timezones or localisation well at all.

If it weren't a problem, I don't think we would be having this discussion, or any qualms about this function.

Should dict[Any, datetime] support localize(tz=...)?

Maybe we should wrap datetime

That's why libraries like arrow and dateutil and others exist, because the std library datetime module is a mess.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing - the other side of DST where time goes backwards is similarly bad, and isn't an example of garbage in.

In [36]: localtime.as_localtime(datetime(2024, 4, 6, 17, tzinfo=UTC), tz=SYD) - localtime.as_localtime(datetime(2024, 4, 6, 16, tzinfo=UTC), tz=SYD)
Out[36]: datetime.timedelta(seconds=3600)

In [37]: localtime.as_localtime(datetime(2024, 4, 6, 17, tzinfo=UTC), tz=SYD) - localtime.as_localtime(datetime(2024, 4, 6, 15, tzinfo=UTC), tz=SYD)
Out[37]: datetime.timedelta(seconds=3600)

Here, we see two 2 hours of UTC being localised to 1 hour of duration, because resolving it to "SYD" is the 2am after the wind-back, and hence the fold attribute is set:

In [39]: localtime.as_localtime(datetime(2024, 4, 6, 16, tzinfo=UTC), tz=SYD)
Out[39]: datetime.datetime(2024, 4, 7, 2, 0, fold=1, tzinfo=zoneinfo.ZoneInfo(key='Australia/Sydney'))

Timezone names aren't helpful for comparing durations over DST boundaries. It's all.. annoying and hard.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should dict[Any, datetime] support localize(tz=...)?

Well, no, but I'd say that's more because dict is generic, where FiniteDatetimeRange is specific.

the other side of DST where time goes backwards is similarly bad

Hmm...thought provoking. To make the localization explicit:

  • 2024-04-06 15:00 UTC => 2024-04-07 02:00 SYD
  • 2024-04-06 16:00 UTC => 2024-04-07 02:00 SYD
  • 2024-04-06 17:00 UTC => 2024-04-07 03:00 SYD

Here, we see two 2 hours of UTC being localised to 1 hour of duration

Is it wrong though? SYD is losing an hour after all, so that has to have an effect somewhere.

xocto/ranges.py Show resolved Hide resolved
@benthorner benthorner force-pushed the ranges/dt-to-date-range branch from c628150 to acc415a Compare December 5, 2024 15:38
@benthorner benthorner requested a review from jarshwah December 5, 2024 15:40
Copy link
Contributor

@jarshwah jarshwah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more blocking change - referencing localtime in day_before - once that's resolved I'm happy to approve.

xocto/ranges.py Outdated Show resolved Hide resolved
@@ -964,6 +965,34 @@ def test_union_of_overlapping_ranges(self):
end=datetime.datetime(2000, 1, 4),
)

class TestLocalize:
def test_converts_timezone(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking/discussion

Yeah I don't think there's anything else we can really do without getting pretty deep into datetime/tz handling logic. Happy with the exception, nothing else to do here.

Further elaboration non-blocking

Localising a datetime, natively, does not work well with ranges, because much of the standard library does not account for canonicalisation/folding/offsets, and we're relying on the standard library here.

We could try and re-interpret any fold attribute to canonicalise the datetime, but I don't think a range module should be trying to do that.

But this is kind of why I'm uncertain about building in localization into this module in the first place. Perhaps it really is better to leave that to outside callers, so they can handle that complexity? Again I'm just musing, and am not requesting changes, but this is where my thought process is on adding functionality that isn't inherent to the concept of range.

We already have footguns though.. simply round-tripping a non-canonical datetime from local_tz to utc back to local_tz can produce errors:

In [61]: SYD = ZoneInfo("Australia/Sydney")

In [62]: dst_forward = datetime(2024, 10, 6, 2, tzinfo=SYD)

In [63]: (localtime.as_localtime(localtime.as_localtime(dst_forward, tz=utc), tz=SYD)-dst_forward).total_seconds()
Out[63]: 3600.0

^ that should be 0 seconds - but the timedelta module doesn't consider offsets important 🙃 and even our seconds and days attributes on FiniteDatetimeRange are going to be wrong with non-utc based start and ends.

tests/test_ranges.py Show resolved Hide resolved
This works towards DRYing-up converting ranges to date ranges in
the next commits - localizing helps ensure a range is aligned to
midnight, even if it was stored/retrieved as e.g. UTC.
This will be used to DRY-up converting datetime ranges.

Before:

    ranges.FiniteDateRange(
        midnight_aligned_range.start.date(),
        midnight_aligned_range.end.date() - timedelta(days=1),
    )

After:

    ranges.date_range_for_midnight_range(midnight_aligned_range)

Pull Request in Kraken Core to demo usage [^1].

[^1]: octoenergy/kraken-core#166268
@benthorner benthorner force-pushed the ranges/dt-to-date-range branch from acc415a to c840f92 Compare December 9, 2024 14:14
benthorner added a commit that referenced this pull request Dec 9, 2024
benthorner added a commit that referenced this pull request Dec 9, 2024
In response to [^1].

This is a known issue with datetime [^2].

[^1]: #178 (comment)
[^2]: https://peps.python.org/pep-0495/#the-fold-attribute
@benthorner benthorner requested a review from jarshwah December 9, 2024 14:40
# Create a range in London over the hour that is "gained"
# when Daylight Savings Time (DST) starts - at 1AM.
#
# Note: this is allowed by datetime but not a realistic
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧀 Added comment that this test isn't very realistic.

benthorner added a commit that referenced this pull request Dec 9, 2024
In response to [^1].

This is a known issue with datetime [^2].

[^1]: #178 (comment)
[^2]: https://peps.python.org/pep-0495/#the-fold-attribute
@benthorner
Copy link
Contributor Author

@jarshwah thanks again for all the feedback 😃.

I've replied to your last comment, and also pushed a commit to bump the version - sorry, forgot to do that as part of the last feedback cycle. I've put it in this PR, and deferred merging, just in case that outstanding thread changes your mind.

@jarshwah
Copy link
Contributor

Good to merge @benthorner - though we usually only need a version bump immediately if there’s a major version shift (to ensure there aren’t releases of other minor changes). There’s not a lot of activity on this package though so it’s fine to bump now.

@benthorner benthorner merged commit 7f34d7c into main Dec 11, 2024
1 check passed
@benthorner benthorner deleted the ranges/dt-to-date-range branch December 11, 2024 10:16
benthorner added a commit that referenced this pull request Dec 18, 2024
benthorner added a commit that referenced this pull request Dec 18, 2024
benthorner added a commit that referenced this pull request Dec 18, 2024
benthorner added a commit that referenced this pull request Dec 18, 2024
benthorner added a commit that referenced this pull request Dec 20, 2024
benthorner added a commit that referenced this pull request Dec 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants