Reputation: 1054
Working in C#, I have an array of strings. Some of these strings are real words, others are complete nonsense. My goal is to come up with a way of deciding which of these strings are real words and which are not.
I had planned to find some kind of word list online that I could bring into my project, turn into a list, and compare against, but of course typing in "C# dictionary" comes up with an unrelated topic! I don't need a 100% accuracy rate.
To formalize the question: In C#, what is the recommended way to establish whether or not a string is a real word?
Advice and guidance are very much appreciated!
Solution
Thanks for the great answers, they were all very useful. As it happens, the thing to do was ask the same question in different wording. Searching for "C# spellcheck" brought up some great links, and I ended up using NHunspell, which you can get through NuGet and is very easy to use.
Upvotes: 0
Views: 130
Reputation: 1294
The problem is that "Dictionary" is a type within the framework, so searching with that word will return all sorts of unrelated results. What you basically want is a spell checker, which will determine whether a word is valid or not.
Searching for "C# spell check" yielded some promising results. Searching for "open source spell check" also turns up some.
I have previously implemented one of the open source ones within a VB6 project. I think it was ASpell. I haven't had to use a spell check library within C#, but I'm sure there is one, or at least one with a .NET wrapper to make implementation easier.
If you have special case words that do not exist in the dictionary/word file for a spell check solution, you can add them.
Upvotes: 1
Reputation: 36524
I don't know of any word list file included by default on Windows, but most Unix-like operating systems include a words file for this purpose. Someone has also posted a words file on GitHub suggested for use in Windows projects. These files are simple lists of words, one per line.
Upvotes: 1
Reputation: 1651
To do this I would use a freely available dictionary for Linux (googling "linux dictionaries" should get you on the right track), read and parse the file, and store it in a C# System.Collections.Generic.HashSet<string> collection. I would probably normalize every entry with .ToUpper() or .ToLower(), but this depends on your requirements.
You can then check if any arbitrary string is in the HashSet efficiently.
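A minimal sketch of that approach, assuming a plain word list with one word per line. The names `WordList`, `FromLines`, `FromFile`, and the `words.txt` path mentioned in the comments are hypothetical. Instead of calling .ToUpper() or .ToLower() on every entry, this version passes `StringComparer.OrdinalIgnoreCase` to the HashSet, which has a similar effect for lookups; switch back to explicit normalization if that suits your requirements better.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper; the class and method names are illustrative only.
public static class WordList
{
    // Builds a case-insensitive set from lines that each contain one word.
    public static HashSet<string> FromLines(IEnumerable<string> lines)
    {
        var words = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (var line in lines)
        {
            var word = line.Trim();
            if (word.Length > 0)
            {
                words.Add(word); // duplicates are ignored by the set
            }
        }
        return words;
    }

    // Reads the word list from disk; "path" is wherever you saved the file.
    public static HashSet<string> FromFile(string path)
    {
        return FromLines(File.ReadLines(path));
    }
}

public static class Program
{
    public static void Main()
    {
        // Stand-in data so the sketch runs without a file. In practice you
        // would call WordList.FromFile("words.txt") with a real word list.
        var dictionary = WordList.FromLines(new[] { "apple", "Banana", "cherry" });

        string[] candidates = { "Apple", "xqzt", "banana" };
        foreach (var candidate in candidates)
        {
            // Contains is an average O(1) lookup on a HashSet.
            Console.WriteLine($"{candidate}: {dictionary.Contains(candidate)}");
        }
    }
}
```

Loading the set once and reusing it for every lookup keeps each check cheap, even for word lists with hundreds of thousands of entries.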
Upvotes: 1