Wikipedia API - get random page(s)

Petter · Nov 9, 2015 · Viewed 9.7k times

I'm trying to get a JSON result with a set of random pages from Wikipedia, including their titles, content and images.

I've played around with their API sandbox, and so far the best I've got is this:

https://en.wikipedia.org/w/api.php?action=query&list=random&format=json&rnnamespace=0&rnlimit=10

But this only includes the namespace, id, and title of ten random pages. I would like to get the content and images as well.

Does anyone know how?

Alternatively, I could make do with the title, content and image URLs of a single random page. The best I've got here is:

https://en.wikipedia.org/w/api.php?action=query&generator=random&format=json

Answer

svick · Nov 15, 2015

You're close. generator=random is the right way to go. You can then use various prop values to get the info you want:

  • Page title is always included.

  • To get the text, use prop=revisions along with rvprop=content.

  • To get all images used on the page, use prop=images.

    Note that this will often include images you're probably not interested in, like icons and flags. To avoid that, you could try prop=pageimages instead, though it doesn't always seem to work. Or you could use both.

So, the final query could look like this:

https://en.wikipedia.org/w/api.php?format=json&action=query&generator=random&grnnamespace=0&prop=revisions|images&rvprop=content&grnlimit=10
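For anyone calling this from code, here is a minimal Python sketch of the same query, using only the standard library. The function names, the User-Agent string and the shape of the returned dictionaries are my own placeholders, not part of the API. It assumes the default JSON layout, where pages come back as an object keyed by page id and the wikitext sits under the "*" key of the first revision. If your response looks different (for example with formatversion=2), adjust the parsing.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_random_pages_url(limit=10):
    """Build the query URL above: random main-namespace pages with text and images."""
    params = {
        "format": "json",
        "action": "query",
        "generator": "random",
        "grnnamespace": 0,
        "grnlimit": limit,
        "prop": "revisions|images",
        "rvprop": "content",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def summarize_pages(response):
    """Pull title, wikitext and image titles out of a decoded JSON response."""
    pages = response.get("query", {}).get("pages", {})
    result = []
    for page in pages.values():
        # A page may lack "revisions" or "images" in a given response,
        # so fall back to empty values instead of raising.
        revisions = page.get("revisions", [])
        # Assumption: default JSON layout, wikitext stored under the "*" key.
        text = revisions[0].get("*", "") if revisions else ""
        images = [image["title"] for image in page.get("images", [])]
        result.append({"title": page.get("title"), "text": text, "images": images})
    return result


def fetch_random_pages(limit=10):
    """Call the API and return the summarized pages (needs network access)."""
    request = urllib.request.Request(
        build_random_pages_url(limit),
        # Placeholder User-Agent; replace it with something identifying your client.
        headers={"User-Agent": "random-pages-example/0.1"},
    )
    with urllib.request.urlopen(request) as reply:
        return summarize_pages(json.load(reply))
```

Calling fetch_random_pages(10) should then give a list of dictionaries with "title", "text" and "images" keys. Note that prop=images returns file titles such as "File:Example.png" rather than URLs, so you'd need a further query to resolve them.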