Wiktionary API - meaning of words

M238 · Nov 14, 2010 · Viewed 9.8k times

I would like to get the meaning of a selected word using the Wiktionary API. The retrieved content should be the same as what is presented in "Word of the day": only the basic meaning, without etymology, synonyms, etc. For example:

"postiche n Any item of false hair worn on the head or face, such as a false beard or wig."

I tried using the documentation, but I can't find a similar example. Can anybody help with this problem?

Answer

PleaseStand · Nov 14, 2010

Although MediaWiki has an API (api.php), for your purposes it may be easiest to use the action=raw parameter to index.php. It returns the wikitext source of a single revision as plain text, rather than wrapped in XML, JSON, etc. as the API does.

For example, this is the raw word of the day page for November 14:

http://en.wiktionary.org/w/index.php?title=Wiktionary:Word_of_the_day/November_14&action=raw
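As a rough sketch of how a script might build and fetch such a URL, the Python below uses only the standard library. The function names `raw_page_url` and `fetch_raw`, the `host` default, and the User-Agent string are illustrative assumptions, not part of MediaWiki.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def raw_page_url(title, host="en.wiktionary.org"):
    """Build an index.php URL that asks for the raw wikitext of a page."""
    # safe=":/" keeps namespace colons and subpage slashes readable in the URL.
    query = urlencode({"title": title, "action": "raw"}, safe=":/")
    return f"http://{host}/w/index.php?{query}"


def fetch_raw(title, host="en.wiktionary.org"):
    """Download the wikitext as a string (performs a network request)."""
    # The User-Agent value is a placeholder; use one that identifies your script.
    request = Request(raw_page_url(title, host),
                      headers={"User-Agent": "example-script/0.1"})
    with urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


print(raw_page_url("Wiktionary:Word_of_the_day/November_14"))
```

Calling `fetch_raw("Wiktionary:Word_of_the_day/November_14")` would then return the page source as text, assuming the server is reachable and the page exists.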

What's unfortunate is that the format of wiki pages focuses on presentation (for the human reader) rather than on semantics (for the machine), so you should not be surprised that there is no "get word definition" API command. Instead, your script will have to make sense of the numerous text formatting templates that Wiktionary editors have created and used, as well as complex presentational formatting syntax, including headings, unordered lists, and others. For example, here is the source code for the page "overflow":

http://en.wiktionary.org/w/index.php?title=overflow&action=raw
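To give an idea of what "making sense" of the wikitext could look like, here is a deliberately naive Python sketch. It assumes that definitions are written as numbered-list lines starting with a single `#`, and it strips only the simplest templates and links. The `SAMPLE` text is a made-up fragment modeled on the "postiche" definition quoted in the question, not the real page source, and the function names are hypothetical. Real entries use many templates that this will not handle.

```python
import re

# Hand-written sample in the style of a Wiktionary entry (an assumption,
# not copied from the live site).
SAMPLE = """==English==

===Noun===
{{en-noun}}

# Any item of [[false]] [[hair]] worn on the [[head]] or [[face]], such as a false [[beard]] or [[wig]].
#: An example sentence that should be skipped.

====Synonyms====
* [[hairpiece]]
"""


def strip_markup(text):
    """Remove a few common wikitext constructs; far from complete."""
    # Drop simple, non-nested templates such as {{en-noun}}.
    text = re.sub(r"\{\{[^{}]*\}\}", "", text)
    # [[target|label]] becomes label, [[target]] becomes target.
    text = re.sub(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]", r"\1", text)
    # Remove bold/italic quote runs ('' and ''').
    text = re.sub(r"'{2,}", "", text)
    # Collapse whitespace left behind by the removals.
    return " ".join(text.split())


def definitions(wikitext):
    """Return cleaned top-level definition lines ('# ...') in page order."""
    found = []
    for line in wikitext.splitlines():
        # A single '#' marks a definition; '#:', '#*' and '##' are
        # examples, quotations and sub-senses, which are skipped here.
        match = re.match(r"#(?![#:*])\s*(.*)", line)
        if match:
            cleaned = strip_markup(match.group(1))
            if cleaned:
                found.append(cleaned)
    return found


print(definitions(SAMPLE))
```

A definition that consists only of a template (for example, a "plural of" template) comes out empty after stripping and is dropped by this sketch, which is one of the many cases a serious script would need to handle.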

There is a "generate XML parse tree" option in the API, but it doesn't break much of the presentational formatting into XML. Just see for yourself:

http://en.wiktionary.org/w/api.php?action=query&titles=overflow&prop=revisions&rvprop=content&rvgeneratexml=&format=jsonfm
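Note that format=jsonfm produces a pretty-printed HTML page meant for people; a script would request format=json instead. The Python sketch below builds such a URL (leaving out the rvgeneratexml option) and pulls the wikitext out of a response. The `SAMPLE_RESPONSE` dictionary is hand-written to mirror the legacy JSON layout as I understand it, with the content under a `"*"` key; treat that shape, the page ID, and the function names as assumptions to check against a real response.

```python
from urllib.parse import urlencode


def revisions_api_url(title, host="en.wiktionary.org"):
    """Build an api.php URL requesting the latest revision's content as JSON."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "revisions",
        "rvprop": "content",
        "format": "json",
    }
    return f"http://{host}/w/api.php?{urlencode(params)}"


def wikitext_from_response(data):
    """Return the wikitext of the first page that has a revision, else None."""
    pages = data.get("query", {}).get("pages", {})
    for page in pages.values():
        revisions = page.get("revisions")
        if revisions:
            # Assumed legacy layout: the content sits under the "*" key.
            return revisions[0].get("*")
    return None


# Hand-written stand-in for a decoded JSON response (page ID is invented).
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "12345": {
                "title": "overflow",
                "revisions": [{"*": "==English==\n# sample definition"}],
            }
        }
    }
}

print(revisions_api_url("overflow"))
print(wikitext_from_response(SAMPLE_RESPONSE))
```

Either way, what comes back is the same wikitext that action=raw returns, so the parsing problem described above remains.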

In case you are wondering whether there exists a parser for MediaWiki-format pages other than MediaWiki, no, there isn't. At least not anything written in JavaScript that's currently maintained (see list of alternative parsers, and check the web sites of the two listed ones). And even then, supporting most/all of the common templates will be a big challenge. Good luck.