Allow forbidden words if they're found in another dictionary #4042

Open
anthonyvdotbe opened this issue Jan 12, 2025 · 7 comments
Comments

@anthonyvdotbe

To reproduce: with "cSpell.language": "en,nl", type the word technologies.

Actual: technologies is marked as forbidden because it is forbidden in the nl-nl dictionary.

Expected: technologies is not marked, since it's in the en-us and en-gb dictionaries.

@Jason3S
Collaborator

Jason3S commented Jan 12, 2025

There is an easy workaround. Add the word in question to the ignore list.

cSpell.ignoreWords

Or try it out in a document by typing cspell:ignore technologies

No need to change code.

See: How to Forbid Words | CSpell

@Jason3S
Collaborator

Jason3S commented Jan 13, 2025

@clivinn-shla81092,

Thank you for the AI suggestion. There is a bit more involved.

The spell checker looks at the problem from a higher level.

From a company level:

  1. As a company, I would like to flag offensive English words to prevent them from appearing in documentation, marketing material, or on our website. Flagging words provides that capability. Flagging a word must override a word in the dictionary.
  2. In order to send out the memo to the company about the flagged words, we need to be able to include them in the document. This is where ignoreWords is used. Ignoring words will allow the word in a document, but the word will not be in the list of spelling suggestions.

It is not a perfect solution, but it meets nearly everyone's needs.
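
The precedence described above (ignored beats flagged, flagged beats dictionary) can be modeled roughly as follows. This is only an illustrative sketch: `WordLists`, `Verdict`, and `check` are hypothetical names, not part of CSpell's API, and the real checker also handles case, compounds, and per-language dictionaries.

```typescript
// Hypothetical model of the three word lists discussed in this thread.
interface WordLists {
  dictionaryWords: Set<string>; // words found in the enabled dictionaries
  flagWords: Set<string>;       // words that must be reported
  ignoreWords: Set<string>;     // words that are never reported
}

type Verdict = "ok" | "ignored" | "flagged" | "unknown";

// Assumed precedence: ignoreWords > flagWords > dictionary.
function check(word: string, lists: WordLists): Verdict {
  if (lists.ignoreWords.has(word)) return "ignored";
  if (lists.flagWords.has(word)) return "flagged";
  if (lists.dictionaryWords.has(word)) return "ok";
  return "unknown";
}
```

With a word that is both in a dictionary and flagged, `check` reports "flagged"; adding the same word to `ignoreWords` changes the result to "ignored".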

@anthonyvdotbe
Author

Thanks. The workaround is not entirely satisfactory though. Given "cSpell.language": "en,nl":

  • when I ignore technologies globally, I cannot, AFAIK, un-ignore it in a document with <!-- cspell: locale nl -->
  • when I ignore technologies locally, then:
    • I have to do so in each document where it occurs
    • if afterwards I change cSpell.language to just nl, those documents will now ignore a word they shouldn't

Either way, the above must be done for each applicable word. Moreover, the set of applicable words depends on the configured set of languages.

I'd propose that CSpell offer a configurable strategy for this, with 3 options:

  1. forbid if any language forbids it
  2. allow if any language allows it
  3. search the language list in order, and allow/forbid according to the first dictionary that contains it

So for technologies:

  • nl,en would forbid it with strategies 1 and 3
  • en,nl would forbid it with strategy 1
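
To make the three proposed strategies concrete, here is a minimal sketch. It assumes each language dictionary can be modeled as a map from a word to "allowed" or "forbidden"; the names `Strategy`, `Dictionary`, and `resolve` are hypothetical and do not exist in CSpell.

```typescript
type Status = "allowed" | "forbidden";
// Hypothetical model: one map per language, listed in configured order.
type Dictionary = Map<string, Status>;
type Strategy = "forbidIfAny" | "allowIfAny" | "firstMatch";

function resolve(
  word: string,
  dictionaries: Dictionary[],
  strategy: Strategy,
): Status | "unknown" {
  // Collect the status from every dictionary that contains the word,
  // preserving the order of the language list.
  const hits: Status[] = [];
  for (const dictionary of dictionaries) {
    const status = dictionary.get(word);
    if (status !== undefined) hits.push(status);
  }
  if (hits.length === 0) return "unknown";
  if (strategy === "forbidIfAny") {
    return hits.indexOf("forbidden") >= 0 ? "forbidden" : "allowed";
  }
  if (strategy === "allowIfAny") {
    return hits.indexOf("allowed") >= 0 ? "allowed" : "forbidden";
  }
  // firstMatch: the first dictionary that knows the word decides.
  return hits[0];
}

const en: Dictionary = new Map<string, Status>([["technologies", "allowed"]]);
const nl: Dictionary = new Map<string, Status>([["technologies", "forbidden"]]);

console.log(resolve("technologies", [nl, en], "firstMatch")); // forbidden
console.log(resolve("technologies", [en, nl], "firstMatch")); // allowed
```

Strategy 1 corresponds to `forbidIfAny`, strategy 2 to `allowIfAny`, and strategy 3 to `firstMatch`.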

@Jason3S
Collaborator

Jason3S commented Jan 14, 2025

@anthonyvdotbe,

Maybe not satisfactory, but sufficient. There are many options.

Using overrides, you can choose which files your settings apply to.

"cSpell.overrides": [
  {
    "filename": ["**/*.md"],
    "language": "en,nl",
    "ignoreWords": ["technologies"]
  },
],

You can even define your own dictionary of words.

        "cSpell.dictionaryDefinitions": [
            {
                "name": "allowWordsInNL",
                "path": "./.cspell/allowWords.txt", // path to file
                "noSuggest": true, // overrides flagged words
                "addWords": true // this means it will show up on the list of places to add words.
            }
        ],
        "cSpell.overrides": [
            {
                "filename": "**/*.md",
                "language": "en,nl",
                "dictionaries": ["allowWordsInNL"]
            }
        ],

@anthonyvdotbe
Author

What I'd like is for a document with "cSpell.language": "en,nl":

  • to not mark technologies in any way
  • to suggest technologies when I type technologied

The proposed workarounds satisfy the first requirement, but not the second.

@Jason3S
Collaborator

Jason3S commented Jan 19, 2025

@anthonyvdotbe,

Would it be clearer to add an option to show ignored words in the suggestions? Like includeIgnoredWordsInSuggestions.
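
As a rough illustration of what such an option might change, the sketch below builds the pool of suggestion candidates with and without it. Everything here is an assumption: `includeIgnoredWordsInSuggestions` is only a proposed name, and `SuggestionLists` and `suggestionCandidates` are hypothetical helpers, not CSpell code.

```typescript
// Hypothetical word lists, modeled as plain sets.
interface SuggestionLists {
  dictionaryWords: Set<string>;
  flagWords: Set<string>;
  ignoreWords: Set<string>;
}

// Returns the words a suggestion engine would be allowed to offer.
// Assumption: flagged words are never suggested unless the proposed
// option is on and the word is also ignored.
function suggestionCandidates(
  lists: SuggestionLists,
  includeIgnoredWordsInSuggestions: boolean,
): string[] {
  const pool = new Set<string>();
  lists.dictionaryWords.forEach((word) => {
    if (!lists.flagWords.has(word)) pool.add(word);
  });
  if (includeIgnoredWordsInSuggestions) {
    lists.ignoreWords.forEach((word) => pool.add(word));
  }
  return Array.from(pool).sort();
}
```

With technologies present in a dictionary, flagged, and ignored, it would be excluded from the pool by default and included once the option is enabled, so it could then be offered for a typo such as technologied.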

Background to the overall logic

Some thought went into the existing system. It is not ideal, but there are no really easy solutions.

Let's take the word incase. It is an old spelling of encase and a company name.
Sadly, it is in the US English dictionary. 99.9% of the time when someone writes incase, they mean in case (as in "In case of fire"), and only occasionally encase.

It is possible to add incase->in case,encase to cSpell.flagWords. This will flag incase as wrong and suggest in case or encase as replacements. But if you work for Incase Inc., you need to be able to override that kind of setting. ignoreWords was the workaround for that issue.
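
For readers unfamiliar with the word->suggestions form, the following sketch shows one way such an entry could be split into the flagged word and its replacements. It covers only the form shown above; `FlagEntry` and `parseFlagEntry` are hypothetical names, and CSpell's own parser may accept more syntax than this.

```typescript
interface FlagEntry {
  word: string;          // the word to flag
  suggestions: string[]; // replacements to offer, possibly empty
}

// Splits an entry such as "incase->in case,encase" at the first "->",
// then splits the right-hand side on commas.
function parseFlagEntry(entry: string): FlagEntry {
  const arrow = entry.indexOf("->");
  if (arrow < 0) {
    return { word: entry.trim(), suggestions: [] };
  }
  const word = entry.slice(0, arrow).trim();
  const suggestions = entry
    .slice(arrow + 2)
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  return { word, suggestions };
}

console.log(parseFlagEntry("incase->in case,encase"));
// { word: 'incase', suggestions: [ 'in case', 'encase' ] }
```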

Another example: the British dictionary contains both the -ize and -ise versions of words, because both are considered valid in the Oxford English Dictionary. But a company might want to forbid the common -ize words. To achieve this, flagWords must take precedence over the normal words in a dictionary. But there are times when it is necessary to use a flagged word; that is where ignoreWords comes in.

@anthonyvdotbe
Author

Would it be clearer to add an option to show ignored words in the suggestions? Like includeIgnoredWordsInSuggestions.

If such an option were added, I'd happily make use of it.

FWIW, I think the root issue is that there are multiple reasons to flag words, and that the spell checker is unable to distinguish the different reasons and take them into account.
If technologies is flagged in the nl dictionary, merely because it's a common misspelling, then with language set to en,nl, the spell checker should be able to figure out that there's no need to flag technologies. I understand this might be infeasible though, as it requires the dictionaries to also embed the reason for each flagged word etc.

3 participants
@Jason3S @anthonyvdotbe and others