-
Notifications
You must be signed in to change notification settings - Fork 516
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ Improve logging in core components #3332
base: main
Are you sure you want to change the base?
Conversation
Not sure why the following Scenario Integration Tests are failing: FAILED examples/anoncreds_issuance_and_revocation::anoncreds_issuance_and_revocation
FAILED examples/presenting_revoked_credential::presenting_revoked_credential
FAILED examples/self_attested::self_attested |
Looks like BDD Interop tests also silently fail: ("silently" because the Github action shows a green tick) Failing scenarios:
features/0036-issue-credential.feature:9 Issue a credential with the Holder beginning with a proposal
features/0036-issue-credential.feature:23 Issue a credential with the Holder beginning with a proposal with DID Exchange Connection
features/0036-issue-credential.feature:53 Issue a credential with the Issuer beginning with an offer
features/0037-present-proof.feature:19 Present Proof where the prover does not propose a presentation of the proof and is acknowledged -- @1.1
features/0037-present-proof.feature:20 Present Proof where the prover does not propose a presentation of the proof and is acknowledged -- @1.2
features/0037-present-proof.feature:37 Present Proof where the prover does not propose a presentation of the proof and is acknowledged with a DID Exchange Connection -- @1.1
features/0037-present-proof.feature:38 Present Proof where the prover does not propose a presentation of the proof and is acknowledged with a DID Exchange Connection -- @1.2
features/0037-present-proof.feature:55 Present Proof of specific types and proof is acknowledged with a Drivers License credential type -- @1.1
features/0037-present-proof.feature:56 Present Proof of specific types and proof is acknowledged with a Drivers License credential type -- @1.2
features/0037-present-proof.feature:74 Present Proof of specific types and proof is acknowledged with a Biological Indicators credential type -- @1.1
features/0037-present-proof.feature:91 Present Proof of specific types and proof is acknowledged with multiple credential types -- @1.1
features/0037-present-proof.feature:109 Present Proof where the prover does not propose a presentation of the proof and is acknowledged -- @1.1
features/0037-present-proof.feature:110 Present Proof where the prover does not propose a presentation of the proof and is acknowledged -- @1.2
features/0037-present-proof.feature:129 Present Proof where the prover has proposed the presentation of proof in response to a presentation request and is acknowledged -- @1.1
features/0037-present-proof.feature:130 Present Proof where the prover has proposed the presentation of proof in response to a presentation request and is acknowledged -- @1.2
features/0037-present-proof.feature:150 Present Proof where the prover has proposed the presentation of proof from a different credential in response to a presentation request and is acknowledged -- @1.1
features/0037-present-proof.feature:151 Present Proof where the prover has proposed the presentation of proof from a different credential in response to a presentation request and is acknowledged -- @1.2
features/0183-revocation.feature:79 Credential revoked and replaced with a new updated credential, holder proves claims with the updated credential with timesstamp -- @1.1
features/0183-revocation.feature:380 Issuer revokes a credential and sends a v1 revocation notification -- @1.1
features/0434-out-of-band.feature:9 Issue a v1 indy credential using connectionless out of band invitation
features/0434-out-of-band.feature:50 Present a v1 indy proof using connectionless out of band invitation
7 features passed, 4 failed, 4 skipped
36 scenarios passed, 21 failed, 103 skipped
348 steps passed, 21 failed, 1014 skipped, 0 undefined
Took 13m2.976s |
1168f15
to
25cbc79
Compare
25cbc79
to
eaefc3e
Compare
@jamshale heads up on the test failures Self-attested
Presenting revoked cred
AnonCreds issuance and revocation
|
Yeah, those test failures are a real head-scratcher for me... Doesn't seem to occur in other PRs. And I can't see what in this contribution will cause the problem. Maybe there's a way to print the ACA-Py agent logs as part of those Test workflows, see if that reveals anything. Only solution I can think of is to rebase and drop / re-add commits until I can identify what causes the problem 😅 |
Just remembered Jamie is out until Wednesday (I was hoping to just throw this his way lol); I'll take a deeper look at this. It should be possible to get some more logs on failures in the scenario tests, I think. |
I'm not back until Friday but quickly looked at the logs. 2 of them are failing when trying to receive the revocation webhook notification. Perhaps the log level affected the library checking for this event. One of the issues with using an external library for this is that we may need to get the library changed for changes like this. For the self attest test, I can't really think of a reason off the top of my head. Running an individual test with higher logging could help find the issue. I'd say because these tests are passing with main, this must either be an issue with the code or the acapy-controller library. |
An unfortunate side effect of this change is that pyright/pylance does not detect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spotted the issue. See comments.
While testing, I couldn't get the trace logs to print from the plugin registry. I tried setting the log level to "trace" (in the docker-compose for the anoncreds scenario test) and only saw debug and higher. Did I miss something? |
Hmm, that sucks... So, my initial idea was to try swap out the default python logger for loguru, which is much better and offers plenty of enhancements over the python logger. Loguru includes trace levels, and built in solutions for serialised logging and timed-file-logging config, etc, which was manually implemented here. What that would look like is defining a centralised loguru "get_logger" method somewhere, with all the acapy-config, and importing that, instead of using Some other nice features are colorized logging, and being able to bind context with logs. (e.g. I think that's still on the cards, because it really is more elegant to work with. But it does represent a big change to a core component (logging). And it'll take some effort to ensure everything remains identical. The potential bulk refactoring can cause some headaches for other maintainers with merge conflicts. And then there's effort in getting everyone on board with the change. So, at first I wanted to scrap the loguru idea, and just go for this trace addition... Buut, what do you think? Any potential appetite for switching to loguru? It'll take some more time, but I can take a swing at it. |
Hmm, it worked on my local with Example for loading
You can see that's why I didn't want it at debug level, because it'll clutter the existing, expected logs for debug. |
I'm not opposed to the idea in principle but definitely agree that it could be a bit of a messy change to integrate with ongoing work streams. How would this affect plugins? Would plugins also have to use loguru or is there a way to get loguru to integrate with Python's builtin logging? Are there any performance implications to loguru over builtin logging module? I think I'm also still a bit on the fence about the idea of a trace log level to begin with. Obviously it's not unheard of to have a level below debug if other loggers like loguru support it. Several of our rust based dependencies like Askar and Indy-VDR also support a trace log level. In the case of those dependencies, though, the inspect-ability of those dependencies is really low because they're compiled libraries. Python, on the other hand, being interpreted, it's pretty trivial to step into a debugger and see exactly what is going on when necessary. I'm a bit skeptical of the value of having the intense kind of detail being emitted to the trace level in these changes. I think we can definitely up our logging game in ACA-Py but the kind of change I think I'm more interested in is probably more effective/judicious use of the INFO level. 90% of the time when I'm scanning through logs, I'm looking for what messages were sent and received or what Admin API requests came in and their responses. In other words, I usually want to filter through less information but I have to use debug level right now or I don't get what I'm after. |
@dbluhm That makes sense. I think the trace level can be omitted... and instead, just add some sensible debug logs, and start the initiative to move more of the useful logs to info level. The primary motivator here was to debug an issue with an agent not starting up at all. It was from testing out a fix for the RuntimeWarning from the IndyVdrLedgerPool (#3328), and there were no debug logs or errors to indicate what was causing the agent to hang. Then I realised there is actually quite a lot of steps in the startup logic with no logging whatsoever. So, I just went at it, adding some sensible logging steps throughout. But, that made debug logging too verbose ... and figured, that is what trace is for. Fortunately, I've already achieved the goal of isolating what was causing the issue. So, I'd say this particular way of implementing trace logging is optional. And, Loguru should be the preferred way to implement trace functionality. I do think Loguru is worth it. The performance improvement is probably negligible (until they implement it in C), and for plugins it shouldn't be any issue. I'm imagining we can define a central, configured logging instance, similar to how we've done here - and then all cases of import logging
LOGGER = logging.getLogger(__name__) can just be replace-all'd with from acapy_agent.config.logging import get_logger
LOGGER = get_logger(__name__) Then everything else remains the same. Using standard python logging would still be possible - it just won't benefit from the loguru features. So it would just require letting all the devs know, hey, use get_logger instead. I think it's one of those "nice-to-haves" that once it's in place, it's a godsend, and makes developing so much easier -- lazy logging, context binding, colorized logging, and much easier configuration. There's just the effort in proving that it's a seamless shift, and that all existing logging configuration can be ported to the new way of doing things. Difficult to forecast how much effort that'll be. But I can try chip away at it, sometime over the next several weeks. So, over the next while I'll try clean up this PR to just contribute some reasonable addition to the startup logs, and if it proves seamless enough, loguru can come later :-) |
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
836f8d6
to
f342646
Compare
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
…tion stacktrace Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
…stacktrace Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Signed-off-by: ff137 <[email protected]>
Quality Gate failedFailed conditions |
I'll wrap this one up sometime soon. I think these are worthwhile changes, which we've pulled into our agent. It helps identify some potential improvements, which, unresolved leads to some noisy debug logs. Will see if I can improve on that |
🚧 WIP - to be refactored
List of changes:
Comparison of start up logs before and after to be included.
Partially resolves #3202
Note:
ClassLoader.load_module
is called. We see the log "Module ... is already loaded" 500 times during startup. And because I've added more verbose debug logging, there are 5 logs between trying to load it and recognising that it's already loaded. So, currently my log changes will introduce 2500 unnecessary log lines. As a solution, I'll implement caching to the load_module function, so it doesn't execute unnecessarily.