I have a set of full URLs like
http://en.wikipedia.org/wiki/Episkopi_Bay
http://en.wikipedia.org/wiki/Monte_Lauro
http://en.wikipedia.org/wiki/Lampedusa
http://en.wikipedia.org/wiki/Himera
http://en.wikipedia.org/wiki/Lago_Cecita
http://en.wikipedia.org/wiki/Aspromonte
I want to find the Wikipedia pageids for these URLs. I have used the MediaWiki API before, but I can't figure out how to do this.
I have tried extracting the page title from each URL by taking the substring after the last "/" (found with lastIndexOf("/")) and then querying the API with that title to get the pageid.
http://en.wikipedia.org/wiki/Episkopi_Bay --> Episkopi_Bay
http://en.wikipedia.org/wiki/Monte_Lauro --> Monte_Lauro
http://en.wikipedia.org/wiki/Lampedusa --> Lampedusa
http://en.wikipedia.org/wiki/Himera --> Himera
http://en.wikipedia.org/wiki/Lago_Cecita --> Lago_Cecita
http://en.wikipedia.org/wiki/Aspromonte --> Aspromonte
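A rough Python sketch of this extraction step is shown here. It is a variant of the lastIndexOf approach rather than the exact code I used: it takes everything after the "/wiki/" prefix, because some titles contain a "/" themselves (e.g. AC/DC), and it decodes percent-escapes. The function name title_from_url is made up for illustration.

```python
from urllib.parse import urlsplit, unquote

WIKI_PREFIX = "/wiki/"  # assumption: article URLs use the standard /wiki/ path


def title_from_url(url):
    """Return the raw page title embedded in a Wikipedia article URL."""
    path = urlsplit(url).path
    if not path.startswith(WIKI_PREFIX):
        raise ValueError("not a /wiki/ article URL: " + url)
    # Keep everything after the prefix so titles containing "/" survive,
    # and turn escapes such as %20 back into plain characters.
    return unquote(path[len(WIKI_PREFIX):])


if __name__ == "__main__":
    for u in ("http://en.wikipedia.org/wiki/Episkopi_Bay",
              "http://en.wikipedia.org/wiki/Monte_Lauro"):
        print(u, "-->", title_from_url(u))
```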
But the problem is that some of my links might be redirects and hence the substring might not always be the title of the page.
TL;DR : How can I find the pageid of a wikipedia page from a URL ?
I’m not sure whether what you call "page id" is the identification number of the page (e.g. 15580374 for English Wikipedia’s Main Page, found under "Page information" in the toolbox in the left column) or the normalised title of a page with redirects resolved. The answer below covers both.
You can use the API with action=query, e.g. https://en.wikipedia.org/w/api.php?action=query&titles=Main%20Page, which returns minimal information about the page, including its page id (the number).
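As a minimal sketch of such a query from Python (standard library only): the helper names build_query_url, extract_pageids and fetch_pageids are hypothetical, and so is the User-Agent string. The code adds format=json and assumes the default JSON layout, in which query.pages is an object keyed by page id and a missing page has no "pageid" field.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "https://en.wikipedia.org/w/api.php"


def build_query_url(titles):
    """Build an action=query URL for one or more titles (joined with '|')."""
    params = {"action": "query", "titles": "|".join(titles), "format": "json"}
    return API + "?" + urlencode(params)


def extract_pageids(response):
    """Map each returned title to its pageid (None if the page is missing)."""
    pages = response.get("query", {}).get("pages", {})
    return {page["title"]: page.get("pageid") for page in pages.values()}


def fetch_pageids(titles):
    """Call the live API; needs network access."""
    request = Request(build_query_url(titles),
                      headers={"User-Agent": "pageid-example/0.1"})  # placeholder UA
    with urlopen(request) as resp:
        return extract_pageids(json.load(resp))


if __name__ == "__main__":
    print(fetch_pageids(["Main Page"]))
```

Several titles can be sent in one request by passing a longer list, which avoids one round trip per URL.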
You may also want to handle more complex cases: title normalisation and/or redirects. Title normalisation (initial capital, underscores changed to spaces, various Unicode normalisations if I recall correctly, etc.) is included out of the box. For redirects, you have to ask explicitly by adding "&redirects" to the URL (note that double redirects, i.e. a redirect to a redirect, won’t be followed, but these should not exist anyway). Example: https://en.wikipedia.org/w/api.php?action=query&titles=main_page&redirects
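When normalisation or a redirect applies, the response reports it, so the title you sent can be traced back to the final page. The sketch below assumes the default JSON layout, where query.normalized and query.redirects are lists of {"from": ..., "to": ...} entries. The function name resolve_pageids is hypothetical, and the sample response is hand-written to illustrate the shape, not captured from the live API.

```python
def resolve_pageids(requested_titles, response):
    """Map each requested title to the pageid of the page it ends up at.

    Follows the 'normalized' mapping first, then the 'redirects' mapping,
    and returns None for titles whose final page is missing.
    """
    query = response.get("query", {})
    normalized = {n["from"]: n["to"] for n in query.get("normalized", [])}
    redirects = {r["from"]: r["to"] for r in query.get("redirects", [])}
    pageid_by_title = {page["title"]: page.get("pageid")
                       for page in query.get("pages", {}).values()}

    result = {}
    for title in requested_titles:
        final = normalized.get(title, title)   # e.g. main_page -> Main page
        final = redirects.get(final, final)    # e.g. Main page -> Main Page
        result[title] = pageid_by_title.get(final)
    return result


if __name__ == "__main__":
    # Hand-written sample in the assumed response shape.
    sample = {"query": {
        "normalized": [{"from": "main_page", "to": "Main page"}],
        "redirects": [{"from": "Main page", "to": "Main Page"}],
        "pages": {"15580374": {"pageid": 15580374, "ns": 0,
                               "title": "Main Page"}},
    }}
    print(resolve_pageids(["main_page"], sample))
```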
If you need more information, you can look at https://en.wikipedia.org/w/api.php?action=help&modules=query%2Binfo.