Split checkers up #38

GrandadEvans · 2025-01-02T23:59:16Z

I am wanting to write the code for a system that will have an Interface, and mean each "checker" has it's own class (Open/Closed principle).
The peck.json file can have

{
  "checkers": [
    "file-names",
    "method-names",
    "class-names",
    "comments",
    "etc"
  ]
}

and then the checkers can be iterated through, a list of words compiled, and then checked

Class BaseChacker {
    public function __construct(
        public $wordsToCheck = [],
        public $checkersBasePath = './src/Checkers/',
    ) {
        foreach($checkers as $checker) {
            if ($this->ableToImportCheckerThroughResolution($checker)) {
                $this->addCheckersWordsToWordList();
            }
        }
    }

    private function ableToImportCheckerThroughResolution(string $checker): array {
        $checkerFileName = $checker->checkerFilename;
        $path = "{$this->checkersBasePath}{$checker}";

        if (!file_exists($path)) {
            throw new CheckerDoesNotExistException($checker);
        }
    }
}

I want to see the existing PRs I have through first, so that I can get some feedback, and get an up to date codebase, and then I know whether the changes I have so far can be implemented into the new changes.
Saying that I'll probably start work on it anyway

The text was updated successfully, but these errors were encountered:

GrandadEvans · 2025-01-03T00:03:31Z

The return type of array on the private method should be bool, and it should return true after it checks for the file existence, but I was making it up on the spot, and realised I had 3 minutes to go to get today's DuoLingo done (super-professional)

c0nst4ntin · 2025-01-03T17:51:47Z

I am also a big fan of splitting this up. It would allow for a better separation of concerns and fewer conflicts with multiple pull requests. We just need to make sure that we don't create a new reflection class instance for each checker.

I believe this was the proposal in #22 with the suggested scanner.

loki495 · 2025-01-07T12:49:39Z

I also like the idea of splitting up scanners, reflecting all the classes (once), and compiling the list of words before doing any of the actual spellchecking. Doing it in "stages" (i.e. scanning -> spellchecking -> output) would keep things separate and cleaner.

While scanning we could also handle trimming down the list of words that the checkers receive. For example:

we could process the ignored words and directories from the config at this stage, not in the checkers
we could ignore short or common words ("the", "in", "for","as" etc.) that sometimes are part of method or variable names
currently we're sometimes passing long lines to the InMemorySpellchecker (and the cache). A "checkIfMessageIsValid" method, will be passed as "check if a message is valid". A DocComment line gets passed as a whole. This makes the cache less effecive, plus many words like "param", "return", "object", "string" are not being filtered out. Each word should probably be added to the word list individually, and filtered out if necessary.
I also like the idea of the technical dictionary in Feature: Technical Dictionary #43, this would also help with the last point
with the new cache its not such a big deal anymore, at leat its not being processed by ASpell each time any more, but the same word will be passed multiple times to the InMemorySpellchecker and the cached file will be read from disk and unserialized each time (this could use it's own optimization, maybe an in-memory cache within the cache?).

And maybe we could even keep track of all occurrences (line and column) of each word in each file? It would eat up progressively more memory the larger the code base gets, but this would make a future "fixer" a trivial matter. Maybe the original word list gets discarded once the spellchecking stage is over, just keeping the misspellings info, or maybe some other level of caching might be needed.

loki495 · 2025-01-07T12:57:40Z

One more 5am thought: Maybe part of the issue is that the current naming is confusing? The "checker" interface could be the scanner, while the InMemorySpellChecker would be the actual singulat checker (with maybe a config setting to choose between ASpell or other?), so maybe we just need to clean up and consolidate the "checker" -> "scanner" interface? Make all the "checkers" just get the word list ("namesToCheck", and maybe line/column info), and pass it back to the kernel, do the word splitting and filtering, then pass it to the actual checker which handle the cache and return the misspellings and suggestions, then the kernel or command passes the list to either an output or a fixer class?

calvinludwig mentioned this issue Jan 9, 2025

fix: class checker inspecting parent classes #61

Closed

2 tasks

c0nst4ntin mentioned this issue Jan 9, 2025

[Scanner]: Introduce a Scanner class to handle file logic before checkers #63

Draft

3 tasks

nunomaduro added the help wanted Extra attention is needed label Jan 10, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Split checkers up #38

Split checkers up #38

GrandadEvans commented Jan 2, 2025

GrandadEvans commented Jan 3, 2025

c0nst4ntin commented Jan 3, 2025

loki495 commented Jan 7, 2025

loki495 commented Jan 7, 2025

Split checkers up #38

Split checkers up #38

Comments

GrandadEvans commented Jan 2, 2025

GrandadEvans commented Jan 3, 2025

c0nst4ntin commented Jan 3, 2025

loki495 commented Jan 7, 2025

loki495 commented Jan 7, 2025