optimize rendering of results by avoiding console.print #2559

Open
williballenthin opened this issue Jan 17, 2025 · 6 comments
Labels
performance Related to capa's performance

Comments

@williballenthin
Collaborator

Using line-profiler, I found that functions like render_feature can be fairly expensive, taking many seconds (cumulatively) to emit their results to the terminal. Digging into this further, it seems that rich's console.print() is relatively slow, taking around 100x longer than the intermediate string construction. This means that performance is poor when we make many small console.print calls, each covering only part of a line.

We can optimize this by constructing complete regions or lines up front, and then flushing them to the terminal with a single console.print. I think that rich's Text.append is still not very fast, but it's better than doing a terminal write.
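A minimal sketch of the batching idea described above. The `render_line` and `render_all` helpers and the sample rows are hypothetical, not capa's actual code. Each line is assembled as one `rich.text.Text`, the lines are joined, and `console.print` is called once for the whole region:

```python
import io

from rich.console import Console
from rich.text import Text


def render_line(name: str, value: str, address: str) -> Text:
    """Build one output line as a single Text object (hypothetical helper)."""
    line = Text()
    line.append(name, style="bold")
    line.append(": ")
    line.append(value, style="cyan")
    line.append(" @ ")
    line.append(address, style="dim")
    return line


def render_all(console: Console, rows: list[tuple[str, str, str]]) -> None:
    """Join all lines into one renderable so console.print runs only once."""
    doc = Text("\n").join(render_line(*row) for row in rows)
    console.print(doc)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    render_all(Console(), [("api", "CreateFileA", "0x401000")])
```

Whether this is actually faster for capa's output would still need to be measured; the sketch only shows the shape of the change.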

Originally we used a StringIO-based strategy of building a large output document and then flushing it in one go (rich-unaware). We might want to migrate back in this direction a little bit.
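For reference, a rich-unaware StringIO strategy could look roughly like the sketch below. The function names and the row format are assumptions for illustration and do not reflect capa's earlier implementation:

```python
import io
import sys


def render_document(rows):
    """Accumulate the whole report in memory and return it as one string."""
    out = io.StringIO()
    for name, value in rows:
        out.write(f"{name}: {value}\n")
    return out.getvalue()


def emit(rows, stream=None):
    """Write the finished document with a single call, then flush."""
    if stream is None:
        stream = sys.stdout
    stream.write(render_document(rows))
    stream.flush()
```

The trade-off is that styling has to be handled separately, since nothing in this path knows about rich.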

@williballenthin williballenthin added the performance Related to capa's performance label Jan 17, 2025
@williballenthin
Collaborator Author

Then again, the output for my test file was a few thousand lines, which is not the typical case (nor wanted), so maybe this is not really worth spending time on.

@fariss
Collaborator

fariss commented Jan 18, 2025

> Originally we used a StringIO-based strategy of building a large output document and then flushing it in one go (rich-unaware). We might want to migrate back in this direction a little bit.

I think that's feasible. Rich allows capturing output via StringIO.

@williballenthin
Collaborator Author

good point. i'm not sure if the performance issue is:

  • writing to stdout, or
  • the overhead of rich

writing to a console backed by StringIO and then flushing it at the end sounds pretty nice and also easy to try.
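A rough sketch of that experiment, assuming a hypothetical `render_buffered` wrapper and a `render` callback that takes a console (neither exists in capa). The console writes into a StringIO, and the accumulated text is written to the real stream once at the end:

```python
import io
import sys

from rich.console import Console


def render_buffered(render, stream=None, styled=True):
    """Run render(console) against an in-memory console, then flush once."""
    buffer = io.StringIO()
    # force_terminal=True keeps ANSI styling even though the sink is not a TTY.
    console = Console(file=buffer, force_terminal=styled)
    render(console)
    text = buffer.getvalue()
    if stream is None:
        stream = sys.stdout
    stream.write(text)
    stream.flush()
    return text
```

If the runtime barely changes with this in place, that points at the overhead of rich rather than at writing to stdout.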

@williballenthin
Collaborator Author

@fariss turns out we're already using the stdout capture functionality of rich:

with console.capture() as capture:

I can try doing an explicit StringIO strategy too. Otherwise I'll need to dig into the overhead of rich and see how we can avoid it.

@williballenthin
Collaborator Author

using StringIO directly with a console instance didn't make a meaningful difference in runtime. Therefore, my suspicion is that rich's console.print is expensive, but I need to dig into this and prove it (and why).

@fariss
Collaborator

fariss commented Jan 21, 2025

The assessment that console.print() is relatively slow is probably correct: each console.print() call has to parse style markup, compute ANSI color codes, etc.
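One way to check this could be a small timing harness like the sketch below. All function names are hypothetical, and no particular result is implied. It prints into an in-memory console so terminal I/O is excluded, and compares per-fragment calls, one call per line, and one call per line with markup parsing and highlighting turned off through the `markup` and `highlight` arguments of `console.print`:

```python
import io
import timeit

from rich.console import Console


def make_console() -> Console:
    # In-memory sink: terminal I/O is excluded, so only rich's own work is timed.
    return Console(file=io.StringIO(), force_terminal=True, width=120)


def print_fragments(console: Console, n: int) -> None:
    """Three console.print calls per output line."""
    for i in range(n):
        console.print("feature", end="")
        console.print(": ", end="")
        console.print(str(i))


def print_lines(console: Console, n: int) -> None:
    """One console.print call per output line."""
    for i in range(n):
        console.print(f"feature: {i}")


def print_plain(console: Console, n: int) -> None:
    """One call per line, skipping markup parsing and highlighting."""
    for i in range(n):
        console.print(f"feature: {i}", markup=False, highlight=False)


def measure(fn, n: int = 200, repeat: int = 3) -> float:
    """Best-of-repeat wall time in seconds for fn on a fresh console."""
    return min(timeit.repeat(lambda: fn(make_console(), n), number=1, repeat=repeat))


if __name__ == "__main__":
    for fn in (print_fragments, print_lines, print_plain):
        print(f"{fn.__name__}: {measure(fn):.4f}s")
```

Running the same harness under line-profiler or cProfile would show where inside console.print the time goes.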
