[Scanner]: Introduce a Scanner class to handle file logic before checkers #63

c0nst4ntin · 2025-01-09T22:40:35Z

What:

Bug Fix
New Feature
Refactor

Description:

Note

This is just a first draft! I am looking for active feedback on this, so we can develop this feature together. So before you open a new pull request with a slightly different version, feel free to write a review or suggestion here. I am more than open to accepting your suggestions on this topic!

This pull request adds a new Scanner class, as proposed in #22, which handles the file scanning and then calls the checkers for each class.

In #22 it was also proposed to create the reflection in the scanner, so that only one reflection has to be created and can then be processed by multiple classes. However, at the moment we have one checker that checks the file's path and name and one checker that uses reflection, so only the latter would make use of it.

I already had to add a supports() method to the Checker interface to know when to call which checker.
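To make the role of supports() concrete, here is a minimal, language-neutral sketch (in Python rather than the package's PHP). Only the supports() name comes from this pull request; the ScannedFile record, the two checker classes, and the run_checkers helper are hypothetical names chosen for illustration and may not match the actual code.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class ScannedFile:
    """Hypothetical record a scanner could hand to the checkers."""
    path: str
    class_name: Optional[str] = None  # None when the file declares no class


class Checker(Protocol):
    """Assumed shape of the checker interface, including supports()."""

    def supports(self, file: ScannedFile) -> bool: ...

    def check(self, file: ScannedFile) -> list[str]: ...


class FileSystemChecker:
    """Looks only at the path and name, so it applies to every file."""

    def supports(self, file: ScannedFile) -> bool:
        return True

    def check(self, file: ScannedFile) -> list[str]:
        return [f"path:{file.path}"]


class ReflectionChecker:
    """Needs a class to reflect on, so it skips files without one."""

    def supports(self, file: ScannedFile) -> bool:
        return file.class_name is not None

    def check(self, file: ScannedFile) -> list[str]:
        return [f"class:{file.class_name}"]


def run_checkers(files: list[ScannedFile], checkers: list[Checker]) -> list[str]:
    """Call each checker only for the files it declares support for."""
    results: list[str] = []
    for file in files:
        for checker in checkers:
            if checker.supports(file):
                results.extend(checker.check(file))
    return results
```

With this shape the scanner does not need to know which checker needs what; each checker decides for itself whether a scanned file is relevant.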

Important

At the moment the tests obviously fail: I changed the interfaces and checkers, so the initialization and the tests are all broken. Before fixing the test suite, I would love to get your feedback to further develop this feature.

- we can filter out:
  - ignored words from config
  - small words like "a", "the", "as", etc.
  - technical words from Feature: Technical Dictionary #43
- repeated words don't even hit the cache (one InMemorySpellchecker->check call per allowed word, even if it appears in many files)
- we're setting things up to be able to create a Fixer class that could go through all the occurrences of each word.

Peck would then work vaguely in these stages:

- kernel registers scanners/checkers/output/fixer classes
- command calls the kernel->getWords or kernel->scanner->getWords to get filtered list of words to check
- command loops through words
  - sends them to the InMemorySpellchecker
  - if result is a Misspelling, pass it to the registered output (or fixer) class
- if no misspellings found at all, output success
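The stages above could look roughly like the following sketch. It is written in Python purely for illustration and is not the package's PHP code. The get_words and run functions, the SMALL_WORDS set, the is_misspelled and report callbacks, and the lowercasing of words are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Callable, Iterable

# Illustrative subset of "small words"; a real list would be longer.
SMALL_WORDS = {"a", "the", "as"}


def get_words(
    occurrences: Iterable[tuple[str, str]], ignored: set[str]
) -> dict[str, list[str]]:
    """Group (word, file) pairs by word, dropping ignored and small words.

    Lowercasing before comparison is an assumption of this sketch.
    """
    grouped: dict[str, list[str]] = defaultdict(list)
    for word, file in occurrences:
        key = word.lower()
        if key in ignored or key in SMALL_WORDS:
            continue
        grouped[key].append(file)
    return dict(grouped)


def run(
    occurrences: Iterable[tuple[str, str]],
    ignored: set[str],
    is_misspelled: Callable[[str], bool],
    report: Callable[[str, list[str]], None],
) -> bool:
    """Check each distinct word once; return True when nothing is misspelled."""
    found = 0
    for word, files in get_words(occurrences, ignored).items():
        if is_misspelled(word):  # one spellchecker call per distinct word
            found += 1
            report(word, files)  # the output (or fixer) sees every occurrence
    return found == 0
```

Because the words are grouped before checking, the spellchecker is called once per distinct word, and the output or fixer still receives every file in which that word occurs.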

This would also decouple fixing/output from the kernel/checker/scanner.

What do you think?

benjam-es

This would need updating to the latest version of the main branch as a base.

I am not quite following the flow of this at the moment. Does this do any caching?

My idea of a scanner is that a checker would use a scanner, and you would have a couple of scanners (class and file system) and many checkers (file system, class, class method, class constant, etc.). With that in mind, the scanner would run first and keep a cache, so the classes are scanned once, and then the checkers run using those cached words to spell check.

This then lends itself to future expansion: we could filter the scanner results down to unique words to call 'aspell' less often, and we could also look at stats on repeated word counts, which could be used to suggest ignore words or preset words.
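As a rough illustration of that idea, the sketch below (Python, not the package's PHP) shows a scanner that extracts the words of each source once, caches them, and offers a unique-word view and repeated-word counts on top of the cache. The CachingScanner class, its method names, and the extract callback are hypothetical and only meant to make the proposal concrete.

```python
from collections import Counter
from typing import Callable, Iterable


class CachingScanner:
    """Hypothetical scanner that extracts the words of each source only once."""

    def __init__(self, extract: Callable[[str], list[str]]) -> None:
        self._extract = extract
        self._cache: dict[str, list[str]] = {}

    def words_for(self, source: str) -> list[str]:
        """Return the words of a source, scanning it on first access only."""
        if source not in self._cache:
            self._cache[source] = self._extract(source)
        return self._cache[source]

    def unique_words(self, sources: Iterable[str]) -> set[str]:
        """Distinct words across all sources, so each is spell checked once."""
        return {word for source in sources for word in self.words_for(source)}

    def repeated_words(
        self, sources: Iterable[str], min_count: int = 2
    ) -> list[tuple[str, int]]:
        """Words seen at least min_count times, most frequent first."""
        counts = Counter(
            word for source in sources for word in self.words_for(source)
        )
        return [(word, n) for word, n in counts.most_common() if n >= min_count]
```

Several checkers could share one such scanner instance, and the repeated-word counts could feed suggestions for the ignore list.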

benjam-es · 2025-01-17T23:44:52Z

I wonder if the package has changed a lot since this. Perhaps switch to an issue and revisit?

refactor (scanner): introduce new scanner and move file logic away from checkers (cefa5e7)

benjam-es suggested changes Jan 11, 2025


loki495 commented Jan 10, 2025
