Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Harness support for Pebble notices #1100

Merged
merged 8 commits into from
Jan 9, 2024

Conversation

benhoyt
Copy link
Collaborator

@benhoyt benhoyt commented Jan 4, 2024

This PR adds Harness support for Pebble Notices, specifically a new Harness.pebble_notify method which you call as follows -- this will add a custom notice with key "example.com/key" and some data:

harness.pebble_notify('mycontainer', 'example.com/key', data={'optional': 'data'})

This will record a notice (which will be visible in the Container.get_notice and Container.get_notices return values) and, if the notice is new or repeated, trigger a pebble-custom-notice event like Juju would.

Note that I've based the fake Harness implementation on the Pebble Go code (for example, State.AddNotice and v1GetNotices). There are some tests of pebble.Client.notify and get_notice and get_notices that are run against a real Pebble server, like we do with some of the other Pebble-related functionality.

I've also removed the get_notices "after" param for now, as it's likely not needed in this context and is hard to implement with the Python microseconds vs Go nanoseconds discrepancy. We'll probably have to do this via a string parameter in future (or something other than datetime.datetime, as that only stores microseconds). But I don't think it'll be used here, as Juju does the long-polling where "after" is required.

@benhoyt benhoyt changed the title feat: Harness support for pebble_notify feat: Harness support for Pebble notices Jan 4, 2024
Also remove get_notices() "after" param for now, as it's likely not
needed in this context and is hard to implement with the Python
microseconds vs Go nanoseconds discrepancy.
ops/testing.py Outdated
raise pebble.APIError(
body={}, code=404, status='Not Found',
message=f"stat {path}: no such file or directory")
raise self._api_error(404, f"stat {path}: no such file or directory") from None
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by change: add a helper function to produce these errors.

from .test_testing import PebbleNoticesMixin, PebbleStorageAPIsTestMixin


def get_socket_path() -> str:
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, drive-by code consolidation to avoid repeating this logic.

relation: str
container: typing.Optional[str]


Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed as it was getting annoying and I don't think it provided much value.

@benhoyt benhoyt marked this pull request as ready for review January 5, 2024 01:23
@tonyandrewmeyer tonyandrewmeyer self-requested a review January 5, 2024 01:30
ops/pebble.py Outdated Show resolved Hide resolved
ops/pebble.py Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
return notice.id, new_or_repeated

def _api_error(self, code: int, message: str) -> pebble.APIError:
status = http.HTTPStatus(code).phrase
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL: I knew that http in the stdlib had a set of the standard statuses, but I didn't know there are also handy phrases attached. Nice!

ops/testing.py Outdated Show resolved Hide resolved
test/test_testing.py Show resolved Hide resolved
test/test_testing.py Outdated Show resolved Hide resolved
test/test_testing.py Outdated Show resolved Hide resolved
ops/testing.py Show resolved Hide resolved
Copy link
Contributor

@tonyandrewmeyer tonyandrewmeyer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Copy link
Member

@jameinel jameinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looking quite good. Two main gripes are repeat_after not being very obvious and that we don't have a way to be testing uid support for containers that arent running as root.

ops/pebble.py Outdated Show resolved Hide resolved
ops/pebble.py Outdated Show resolved Hide resolved
ops/pebble.py Outdated Show resolved Hide resolved
@@ -2772,14 +2798,18 @@ def get_notices(
user (notices whose ``user_id`` matches the requester UID as well as
public notices).

Note that the "after" filter is not yet implemented, as it's not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rounding up to the ms seems like a reasonable way to still get the behavior you want out of a "give me all the notices since X", but I suppose the concern is that you could have a notice happen 1ns after the threshold and thus wasnt sent the first time and get skipped the second.
But if it was ms, you could definitely have 2 events in the same ms and so the threshold needs to be >=.
That said, even ns isn't that good of a guarantee of never having more than one. (and even if the resolution of the value is ns, your clock may only have ms resolution)
if you make the rule >=, then you get overlapping values, so you have to do some local filtering, but you guarantee to not miss anything.
if you want this to be strictly >, then it needs to be an opaque token that you get that hand back again (and maybe you have a separate helper to convert to a datetime)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For better or worse, this is what's implemented in Pebble (with nanoseconds). However, it is safe if you specify after=<string of last value you saw>, at least assuming that two separate notices can't be recorded in the same nanosecond -- I safe bet for a long time, I suspect. :-)

In any case, this is why I decided to leave this out for now, as it's tricky to get right. We'd have to allow an option to not convert the last-repeated JSON field to a datetime, keep it as a string, and pass that directly through. I can think of a path to get there, but it seems too complex for something we'll probably never use in a charm.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that I don't think we'll ever generate 2 notices in the same nanosecond. My concern is more that while the representation supports nanoseconds, I don't think the clock gives that kind of resolution.
(famously the time.clock on windows was actually a 60Hz clock, and you had to use the "high performance" timers to even get ms resolution. Digging for Go it seems you might get 100ns resolution https://stackoverflow.com/questions/14610459/how-precise-is-gos-time-really
but it is platform dependent and on BSD was 1ms for a long time)

Just to say, having an opaque key is good, trusting your clock to give you unique values isnt.
In Pebble you could have a check if the timestamp is already cached you just add 1ns to the value to make it unique and it becomes a sort order.

mgo assumed 1ms resolution for its ObjectId and kept 8 bits for a counter (so up to 256 events per ms could be disambiguated)
Go went Nanoseconds because it makes everything straightforward and maybe we'll have clocks close to that (MS now claims 100ns)
I think designing a system assuming a millisecond clock will keep you robust.

ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Outdated Show resolved Hide resolved
ops/testing.py Show resolved Hide resolved
test/test_testing.py Show resolved Hide resolved
key2 = 'example.com/' + os.urandom(16).hex()
id1 = client.notify(pebble.NoticeType.CUSTOM, key1)
id2 = client.notify(pebble.NoticeType.CUSTOM, key2, data={'x': 'y'})
time.sleep(0.000_001) # Ensure times are different.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is usually a lot better to have ways of injecting a time source rather than sleeping to get time to change.
probably a fair bit of overhead for this test, but eventually sleep based tests bite you on someone else's hardware acting differently than you expected.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. I would have patched/injected, but these tests are used against "real Pebble" as well (via the mixin). So we need real time for that (unless we want to inject something in Pebble, but that gets really painful).

Copy link
Member

@jameinel jameinel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some comments about how we might design after but I'm happy with the implementation and you pulled out after for now anyway.

@@ -2744,7 +2744,6 @@ def get_notices(
user_id: Optional[int] = None,
types: Optional[Iterable[Union[pebble.NoticeType, str]]] = None,
keys: Optional[Iterable[str]] = None,
after: Optional[datetime.datetime] = None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it, you could just have some opaque Notice.id and then take after=notice.id. That field could be taken from last_repeated but is intentionally left opaque.

@@ -2772,14 +2798,18 @@ def get_notices(
user (notices whose ``user_id`` matches the requester UID as well as
public notices).

Note that the "after" filter is not yet implemented, as it's not
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that I don't think we'll ever generate 2 notices in the same nanosecond. My concern is more that while the representation supports nanoseconds, I don't think the clock gives that kind of resolution.
(famously the time.clock on windows was actually a 60Hz clock, and you had to use the "high performance" timers to even get ms resolution. Digging for Go it seems you might get 100ns resolution https://stackoverflow.com/questions/14610459/how-precise-is-gos-time-really
but it is platform dependent and on BSD was 1ms for a long time)

Just to say, having an opaque key is good, trusting your clock to give you unique values isnt.
In Pebble you could have a check if the timestamp is already cached you just add 1ns to the value to make it unique and it becomes a sort order.

mgo assumed 1ms resolution for its ObjectId and kept 8 bits for a counter (so up to 256 events per ms could be disambiguated)
Go went Nanoseconds because it makes everything straightforward and maybe we'll have clocks close to that (MS now claims 100ns)
I think designing a system assuming a millisecond clock will keep you robust.

# Update last repeated time if repeat-after time has elapsed (or is None)
last_repeated = now
new_or_repeated = True
notice = dataclasses.replace(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@benhoyt benhoyt merged commit 5d27857 into canonical:main Jan 9, 2024
18 checks passed
@benhoyt benhoyt deleted the harness-notify branch January 9, 2024 21:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants