Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: parallel processing #119

Closed

Conversation

chr15k
Copy link
Contributor

@chr15k chr15k commented Jan 16, 2025

What:

  • Bug Fix
  • New Feature

Description:

Implemented parallel processing in the Aspell Spellchecker class, resulting in a 40% performance improvement on the first (non-cached) run.

Testing

To more easily test this, manually comment out caching in the Aspell class, run Pest while switching between main branch and this branch.

Test notes

Main branch with caching turned off

  • 12.22s
  • 11.97s
  • 11.98s

This branch with caching turned off

  • 8.20s
  • 8.26s
  • 8.24s

@chr15k chr15k changed the title refactor: parallel processing (40% performance improvement) refactor: parallel processing (40% first-run performance improvement) Jan 16, 2025
Copy link
Contributor

@benjam-es benjam-es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be worth adding a test if possible?

@chr15k chr15k changed the title refactor: parallel processing (40% first-run performance improvement) refactor: parallel processing Jan 18, 2025
@chr15k
Copy link
Contributor Author

chr15k commented Jan 18, 2025

It may be worth adding a test if possible?

Hey @benjam-es, after adding more tests, I noticed that performance worsens with large datasets (e.g., if $text was 10,000+ lines), as it spawns n processes. I then tried chunking, which helped but was still slower than the main branch for this size. Given this, I actually don’t think it’s worth committing since the change isn't significant enough an improvement and adds extra complexity. I think if we can improve the first-run speed we'll need a better/different approach. I'll close this PR, thanks for looking!

@chr15k chr15k closed this Jan 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants