HeretoLearn

Reputation: 7434

Open source spell check

I am evaluating adding spell check to a product I own. From my research, these are the major decisions that need to be made:

  1. The library to use.
  2. The dictionary (this can be region-specific: British English, American English, etc.).
  3. Exclusion lists. Any time a typo is detected, it's possible that it's not a typo but terminology specific to the user. At that point the user should be given the ability to add the word to their custom exclusion list.
  4. Besides the per-user custom list, a shared exclusion list based on the domain of the tool's clients, that is, terms and acronyms from the users' line of work. For example, FX is not a typo for currency traders.
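
To make points 3 and 4 concrete, here is a rough sketch of how the layers could stack: a base dictionary, a shared domain list, and a per-user list, checked in that order. The class name `LayeredSpellChecker` and its methods are hypothetical and exist only for illustration. A real setup would delegate the base lookup to a spell-checking library rather than a plain set of words.

```python
from typing import Iterable, List, Set


class LayeredSpellChecker:
    """Hypothetical wrapper: base dictionary plus domain and per-user exclusion lists."""

    def __init__(self, base_words: Iterable[str], domain_terms: Iterable[str] = ()):
        # Everything is lowercased so lookups are case-insensitive (a simplification).
        self.base: Set[str] = {w.lower() for w in base_words}
        self.domain: Set[str] = {w.lower() for w in domain_terms}
        self.user: Set[str] = set()

    def add_user_word(self, word: str) -> None:
        """Called when the user says 'this is not a typo'."""
        self.user.add(word.lower())

    def is_known(self, word: str) -> bool:
        w = word.lower()
        return w in self.base or w in self.domain or w in self.user

    def misspelled(self, text: str) -> List[str]:
        # Naive tokenizer: split on whitespace, strip common punctuation.
        tokens = (t.strip(".,;:!?()\"'") for t in text.split())
        return [t for t in tokens if t and not self.is_known(t)]


checker = LayeredSpellChecker(["the", "rate", "moved", "today"], domain_terms=["FX"])
flagged = checker.misspelled("The FX rate moved todya.")
```

With this toy data only `todya` is flagged, because `FX` is covered by the domain list. After `checker.add_user_word("todya")` it would no longer be flagged for that user.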

The open questions I have are listed below, and any input on them would be very useful. For 1, I was thinking of Hunspell, which is the open source library offered under the MPL and used by Firefox and the OpenOffice family of products. Any horror stories out there from using it? Any grey areas with the licensing? The spell checking will happen on a Windows client.

Dictionaries are available from a variety of sources, some free under the MPL and some not. Any suggestions on good sources for free dictionaries?

What about multilingual support, and what needs to be worked out to support it?

For 4, how are custom dictionaries kept in sync between the server side and the client side? The spell check needs to happen on the client side, so are they pushed down at every launch, or synced every so often?

Upvotes: 14

Views: 12793

Answers (4)

Brijesh

Reputation: 796

Here is a good demonstration by Peter Norvig. I find this simple explanation much more intuitive. Follow the links in the essay as well for more in-depth analysis.

http://norvig.com/spell-correct.html
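
For a flavor of the approach, here is a condensed sketch in the spirit of that essay, not a copy of it. It generates every string one edit away from the input and picks the most frequent known word. The tiny `WORDS` frequency table is an assumption made for illustration; a real one would be counted from a large text corpus. This version only looks one edit away, whereas the essay also considers candidates two edits away.

```python
from collections import Counter

# Toy frequency table (assumption for illustration; normally built from a corpus).
WORDS = Counter({"spelling": 10, "spell": 8, "check": 6, "checker": 4})

LETTERS = "abcdefghijklmnopqrstuvwxyz"


def edits1(word):
    """All strings that are one delete, transpose, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in LETTERS]
    inserts = [l + c + r for l, r in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)


def known(words):
    """Keep only the candidates that appear in the frequency table."""
    return {w for w in words if w in WORDS}


def correction(word):
    """Prefer the word itself if known, else the most frequent known word one edit away."""
    candidates = known([word]) or known(edits1(word)) or [word]
    return max(candidates, key=lambda w: WORDS[w])
```

With the toy table above, `correction("speling")` gives `spelling` and `correction("chekc")` gives `check`. A word with no known neighbor is returned unchanged.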

Upvotes: 1

Thomas Maierhofer

Reputation: 2681

As already mentioned, Hunspell is a state-of-the-art spell checker. It is the spell checker used by OpenOffice, Thunderbird, Firefox and Google Chrome. Ports to all major programming languages are available. It works with the OpenOffice dictionaries, so a lot of languages are supported.

Upvotes: 11

Artyom

Reputation: 31233

There are several popular options that are widely used: MySpell and Aspell. Check them out.

Upvotes: 2

Zifre

Reputation: 26998

I've used Hunspell for a few things, and I don't really have any horror stories with it. I've only used it with English (American), though; it claims to work with other languages.

As for licensing, it offers a choice of GPL, LGPL, and MPL. If you don't like the MPL, you can always choose to use the LGPL.

Upvotes: 3
