Where can I obtain an English dictionary with structured data?

Portman · Sep 25, 2010 · Viewed 21.4k times

I would like to download an English dictionary -- not just a word list -- in a structured format such as TXT, XML, or SQL.

Specifically, I need phonetic pronunciation and parts of speech (definition is not required).

Surprisingly, I can't find this online anywhere. Wiktionary is available for download, but it is only the MediaWiki articles themselves. Crawling all articles and extracting the phonetics and parts of speech would be a huge exercise.

Is this available anywhere? I don't mind paying.

Edit: a few people have asked what I would like to do. My immediate need is just curiosity, for example "what are the most common two-syllable verbs?". Eventually my hope would be a tool that helps you find available domain names by pairing words with the correct parts of speech, with bonus points for phonetic matches.

Note: cross-posted on English Language and Usage.

Answer

matthuhiggins · Sep 30, 2010

Go to http://www.speech.cs.cmu.edu/cgi-bin/cmudict and you will find the download page for the pronunciation dictionary at https://cmusphinx.svn.sourceforge.net/svnroot/cmusphinx/trunk/cmudict/

The latest version is currently cmudict.0.7a.

This is what I am currently using to implement the syllable counter for http://www.haikuvillage.com. It's in Ruby and I'd be happy to open source it for you if that helps.
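
For anyone who wants to try the idea before downloading anything, here is a minimal sketch of syllable counting from CMUdict-style entries. It is in Python and is not the Ruby code mentioned above. It assumes the plain-text layout of one entry per line (the word, then its phonemes), comment lines starting with `;;;`, alternate pronunciations marked as `WORD(1)`, and vowel phonemes ending in a stress digit. The sample entries and the function names `parse_cmudict` and `count_syllables` are made up for illustration, so check them against the file you actually download.

```python
import re
from collections import defaultdict

# Illustrative sample in the assumed cmudict plain-text layout.
# In practice you would read these lines from the downloaded file.
SAMPLE = """\
;;; comment lines start with three semicolons
DICTIONARY  D IH1 K SH AH0 N EH2 R IY0
HAIKU  HH AY1 K UW0
READ  R EH1 D
READ(1)  R IY1 D
"""


def parse_cmudict(lines):
    """Map each lowercase word to a list of pronunciations (phoneme lists)."""
    entries = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phonemes = line.split()
        # Drop the variant marker, so "READ(1)" is filed under "read".
        word = re.sub(r"\(\d+\)$", "", word)
        entries[word.lower()].append(phonemes)
    return dict(entries)


def count_syllables(phonemes):
    """Count vowel phonemes, i.e. those ending in a stress digit (0, 1, 2)."""
    return sum(1 for p in phonemes if p[-1].isdigit())


entries = parse_cmudict(SAMPLE.splitlines())
for word in ("dictionary", "haiku", "read"):
    print(word, [count_syllables(p) for p in entries[word]])
```

Each word maps to a list because some words have more than one pronunciation, and the variants can differ in syllable count. Note that this only covers the pronunciation half of the question: the file holds pronunciations, not parts of speech, so those would have to come from another source.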