Determine the difficulty of an english word

Techtwaddle picture Techtwaddle · Feb 28, 2011 · Viewed 11.8k times · Source

I am working a word based game. My word database contains around 10,000 english words (sorted alphabetically). I am planning to have 5 difficulty levels in the game. Level 1 shows the easiest words and Level 5 shows the most difficult words, relatively speaking.

I need to divide the 10,000 long words list into 5 levels, starting from the easiest words to difficult ones. I am looking for a program to do this for me.

Can someone tell me if there is an algorithm or a method to quantitatively measure the difficulty of an english word?

I have some thoughts revolving around using the "word length" and "word frequency" as factors, and come up with a formula or something that accomplishes this.

Answer

Martin DeMello picture Martin DeMello · Feb 28, 2011

Get a large corpus of texts (e.g. from the Gutenberg archives), do a straight frequency analysis, and eyeball the results. If they don't look satisfying, weight each text with its Flesch-Kincaid score and run the analysis again - words that show up frequently, but in "difficult" texts will get a score boost, which is what you want.

If all you have is 10000 words, though, it will probably be quicker to just do the frequency sorting as a first pass and then tweak the results by hand.