It seems that the Wikipedia API's definition of a link is different from a URL? I'm trying to use the API to return all the URLs in a specific wiki page.
I have been playing around with this query, which I found on this page under generators and redirects.
I'm not sure why exactly you are confused (it would help if you explained that), but I'm quite sure that query is not what you want. It lists links (prop=links) on pages that are linked (generator=links) from the page “Title” (titles=Title). It also lists only the first page of links on the first page of links (with the page size at its tiny default value of 10).
If you want to get all the links on the page “Title”:
Use prop=links; you don't want the generator.
Use pllimit=max (pl is the “prefix” for links) to raise the page size to the maximum allowed.
Use the query-continue element to get to the second (and following) page of results.
So, the query for the first page would be:
http://en.wikipedia.org/w/api.php?action=query&titles=Title&prop=links&pllimit=max
And the second (and in this case, final) page:
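The continuation value for each following request comes from the previous response, so the paging is easiest to show as a loop. Here is a minimal Python sketch of that idea, not a definitive implementation. The names `fetch_json` and `all_links` are hypothetical helpers, and `format=json` and `rawcontinue` are my additions: as far as I know, newer MediaWiki versions return a `continue` element by default and only emit the older `query-continue` element when `rawcontinue` is passed, so treat that part as an assumption and check the API documentation for your version.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "http://en.wikipedia.org/w/api.php"


def fetch_json(params):
    """Hypothetical helper: send one API request and decode the JSON body."""
    with urlopen(API_URL + "?" + urlencode(params)) as response:
        return json.load(response)


def all_links(title, fetch=fetch_json):
    """Collect the titles of all internal links on one page, following paging."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "links",
        "pllimit": "max",
        "format": "json",
        # Assumption: asks newer API versions for the old query-continue form.
        "rawcontinue": "",
    }
    titles = []
    while True:
        data = fetch(dict(params))  # pass a copy so each request is independent
        for page in data.get("query", {}).get("pages", {}).values():
            titles.extend(link["title"] for link in page.get("links", []))
        # query-continue holds the parameters for the next page, e.g. plcontinue.
        next_params = data.get("query-continue", {}).get("links")
        if not next_params:
            return titles
        params.update(next_params)
```

Calling `all_links("Title")` would then issue as many requests as there are result pages. The `fetch` argument exists only so the loop can be tried against canned responses without touching the network.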
Another thing that might be confusing you is that links returns only internal links (to other Wikipedia pages). To get external links, use prop=extlinks. You can also combine the two into one query:
http://en.wikipedia.org/w/api.php?action=query&titles=Title&prop=links|extlinks
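If you build these URLs in code, something like the following Python sketch may help. The function name `build_links_url` is hypothetical, and adding `format=json` plus a limit for each property (`pllimit` for links, `ellimit` for extlinks) is my own choice rather than part of the query above.

```python
from urllib.parse import urlencode

API_URL = "http://en.wikipedia.org/w/api.php"

# Parameter prefixes for the two link-related properties.
PREFIXES = {"links": "pl", "extlinks": "el"}


def build_links_url(title, props=("links", "extlinks"), limit="max"):
    """Build a query URL that requests the given link properties for one page."""
    params = {
        "action": "query",
        "titles": title,
        "prop": "|".join(props),  # several properties are separated by "|"
        "format": "json",
    }
    for prop in props:
        params[PREFIXES[prop] + "limit"] = limit
    # safe="|" keeps the separator readable instead of encoding it as %7C.
    return API_URL + "?" + urlencode(params, safe="|")


print(build_links_url("Title"))
```

Each property pages independently, so a combined query can still need continuation for one property after the other has run out.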