I need to implement a spell checker in C. Basically, I need all the standard operations... I need to be able to spell check a block of text, make word suggestions and dynamically add new words to the index.
I'd kind of like to write this myself, though I really don't know where to begin.
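Read up on tree traversal. The basic concept is to load the dictionary into a search tree with one letter per node, then check each word of the document by walking the tree letter by letter.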
A really short example:
Dictionary:
apex apple appoint appointed
Tree: (* indicates valid end of word)
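update: Thank you to Curt Sampson for pointing out that this data structure is called a Patricia Tree. (Drawn with one letter per node, as here, it is more commonly called a trie or prefix tree; a Patricia tree is the compressed variant that merges chains of single-child nodes.)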
    A -> P -> E -> X*
          \-> P -> L -> E*
               \-> O -> I -> N -> T* -> E -> D*
Document:
apple appint ape
Results:
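"apple" is found in the tree, so it is considered correct.
"appint" is flagged as incorrect. Traversing the tree, you follow A -> P -> P, but the second P does not have an I child node, so the search fails.
"ape" also fails, since the E node in A -> P -> E does not have the "valid end of word" flag set.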
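The tree walk from the example can be sketched in C roughly as follows. This is only a minimal sketch, not a finished spell checker: the names (TrieNode, trie_new, trie_insert, trie_contains, trie_free) are made up for illustration, and it assumes words contain only the ASCII letters a-z, folded to lower case, with one node per letter as in the diagram.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define ALPHABET 26

/* One node per letter; end_of_word is the "*" flag from the diagram. */
typedef struct TrieNode {
    struct TrieNode *child[ALPHABET];
    bool end_of_word;
} TrieNode;

/* Allocate an empty node (all children NULL); NULL if out of memory. */
TrieNode *trie_new(void) {
    return calloc(1, sizeof(TrieNode));
}

/* Map an ASCII letter to 0..25, or -1 for any other character. */
static int letter_index(char c) {
    if (c >= 'a' && c <= 'z') return c - 'a';
    if (c >= 'A' && c <= 'Z') return c - 'A';
    return -1;
}

/* Add a word, creating nodes as needed. Returns false on a
   non-letter character or on allocation failure. */
bool trie_insert(TrieNode *root, const char *word) {
    assert(root != NULL && word != NULL);
    TrieNode *node = root;
    for (; *word; ++word) {
        int i = letter_index(*word);
        if (i < 0) return false;
        if (node->child[i] == NULL) {
            node->child[i] = trie_new();
            if (node->child[i] == NULL) return false;
        }
        node = node->child[i];
    }
    node->end_of_word = true;
    return true;
}

/* True only if every letter has a child node AND the final node
   carries the end-of-word flag. */
bool trie_contains(const TrieNode *root, const char *word) {
    assert(root != NULL && word != NULL);
    const TrieNode *node = root;
    for (; *word; ++word) {
        int i = letter_index(*word);
        if (i < 0 || node->child[i] == NULL) return false;
        node = node->child[i];
    }
    return node->end_of_word;
}

/* Release the whole tree. */
void trie_free(TrieNode *node) {
    if (node == NULL) return;
    for (int i = 0; i < ALPHABET; ++i) trie_free(node->child[i]);
    free(node);
}
```

After inserting apex, apple, appoint and appointed, trie_contains accepts "apple" and rejects both "appint" (no I child under the second P) and "ape" (the E node is not flagged). Adding a word to the index later is just another trie_insert call.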
edit: For more details on spelling suggestions, look into Levenshtein Distance, which measures the smallest number of single-character changes (insertions, deletions or substitutions) needed to convert one string into another. The best suggestions are the dictionary words with the smallest Levenshtein Distance to the incorrectly spelled word.
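As a rough illustration of the suggestion step, here is one way the distance could be computed in C, using the standard dynamic-programming recurrence with a single rolling row. The function names (levenshtein, best_suggestion) and the idea of scanning a plain array of candidate words are assumptions made for this sketch; a real checker would probably want to prune candidates rather than compare against every dictionary word.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static size_t min3(size_t a, size_t b, size_t c) {
    size_t m = a < b ? a : b;
    return m < c ? m : c;
}

/* Levenshtein distance between a and b, keeping only one row of the
   DP table. Returns (size_t)-1 if memory cannot be allocated. */
size_t levenshtein(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    size_t la = strlen(a), lb = strlen(b);
    size_t *row = malloc((lb + 1) * sizeof *row);
    if (row == NULL) return (size_t)-1;

    /* Turning "" into the first j characters of b takes j insertions. */
    for (size_t j = 0; j <= lb; ++j) row[j] = j;

    for (size_t i = 1; i <= la; ++i) {
        size_t diag = row[0];   /* previous row, column j-1 */
        row[0] = i;             /* first i characters of a -> "" */
        for (size_t j = 1; j <= lb; ++j) {
            size_t up = row[j]; /* previous row, column j */
            size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            row[j] = min3(up + 1,          /* delete a[i-1] */
                          row[j - 1] + 1,  /* insert b[j-1] */
                          diag + cost);    /* substitute or match */
            diag = up;
        }
    }
    size_t d = row[lb];
    free(row);
    return d;
}

/* Return the candidate with the smallest distance to word (the first
   one on ties), or NULL if there are no candidates. */
const char *best_suggestion(const char *word,
                            const char *const *candidates, size_t n) {
    const char *best = NULL;
    size_t best_d = (size_t)-1;
    for (size_t k = 0; k < n; ++k) {
        size_t d = levenshtein(word, candidates[k]);
        if (d < best_d) {
            best_d = d;
            best = candidates[k];
        }
    }
    return best;
}
```

With the example dictionary above, "appint" is one insertion away from "appoint" and further from every other word, so best_suggestion would offer "appoint".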