Python requests call with URL using parameters

johan855 · Jul 20, 2016 · Viewed 32.7k times

I am trying to make a call to the import.io API. This call needs to have the following structure:

'https://extraction.import.io/query/extractor/{{crawler_id}}?_apikey=xxx&url=http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35'

As you can see, the call itself takes a "url" parameter:

http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35

This secondary URL has parameters of its own. But if I pass it as a plain string, as in the example above, the API response only includes the part up to and including its first parameter:

http://www.example.co.uk/items.php?sortby=Price_LH

This is not correct; it looks as if the API is making the call with the truncated URL instead of the one I passed in.

I am using Python and requests to make the call as follows:

import requests
import json

auth_key = 'xxx'  # placeholder API key, matching the _apikey value shown in r.url below
row_dict = {'url': u'http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35', 'crawler_id': u'zzz'}
url_call = 'https://extraction.import.io/query/extractor/{0}?_apikey={1}&url={2}'.format(row_dict['crawler_id'], auth_key, row_dict['url'])
r = requests.get(url_call)
rr = json.loads(r.content)

And when I print the result:

"url" : "http://www.example.co.uk/items.php?sortby=Price_LH",

but when I print r.url:

https://extraction.import.io/query/extractor/zzz?_apikey=xxx&url=http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35

So the request URL looks fine, but the URL in the response does not.

I tried this with other URLs, and they all get cut off after the first parameter.

Answer

Demitri · Mar 27, 2018

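The requests library will handle all of your URL encoding needs. The nested URL is most likely being truncated because its "&" characters are not percent-encoded, so they are read as separators of the outer query string rather than as part of the "url" value. This is the proper way to add parameters to a URL using requests: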

import requests

base_url = "https://extraction.import.io/query/extractor/{{crawler_id}}"
params = dict()
params["_apikey"] = "xxx"
params["url"] = "http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35"

r = requests.get(base_url, params=params)
print(r.url)

An arguably more readable way to format your parameters:

params = {
    "_apikey" : "xxx",
    "url" : "http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35"
}
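
To see why the hand-built query string loses the rest of the nested URL, the sketch below parses both variants with the standard library's urllib.parse. It only approximates how a typical server-side parser would read the query; how import.io actually parses requests is an assumption here, and the variable names are illustrative.

```python
from urllib.parse import parse_qs, urlencode

target = "http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35"

# Hand-built query string: the nested URL's "&" separators are left raw,
# so a parser treats per_page, size and page as parameters of the outer call.
raw_query = "_apikey=xxx&url=" + target
raw_parsed = parse_qs(raw_query)

# Percent-encoded query string, comparable to what requests builds from params=.
encoded_query = urlencode({"_apikey": "xxx", "url": target})
encoded_parsed = parse_qs(encoded_query)

print(raw_parsed["url"][0])      # http://www.example.co.uk/items.php?sortby=Price_LH
print(sorted(raw_parsed))        # ['_apikey', 'page', 'per_page', 'size', 'url']
print(encoded_parsed["url"][0] == target)  # True
```

The first print matches the truncated value reported in the question, which is consistent with the unencoded "&" being the cause.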

Note that the {{crawler_id}} piece above is not a URL parameter but part of the base URL. Requests does not perform general string templating, so substitute the real crawler ID yourself (for example with str.format, as in the question) before making the call.
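
One possible way to combine the two pieces is sketched below: the crawler ID goes into the path with str.format, and the query parameters are left unencoded for requests to handle. The helper name build_extractor_request is hypothetical, and the "zzz" and "xxx" values are the placeholders from the question. The sketch only builds the arguments, so it runs without network access.

```python
from urllib.parse import quote


def build_extractor_request(crawler_id, api_key, target_url):
    """Return (base_url, params) suitable for requests.get(base_url, params=params)."""
    # The crawler ID is part of the path, so it is filled in with str.format.
    # quote() escapes any characters that are not valid in a path segment.
    base_url = "https://extraction.import.io/query/extractor/{0}".format(
        quote(crawler_id, safe="")
    )
    # Query parameters are kept as plain strings; requests encodes them on send.
    params = {"_apikey": api_key, "url": target_url}
    return base_url, params


base_url, params = build_extractor_request(
    "zzz",
    "xxx",
    "http://www.example.co.uk/items.php?sortby=Price_LH&per_page=96&size=1%2C12&page=35",
)
print(base_url)  # https://extraction.import.io/query/extractor/zzz
```

The returned pair can then be passed straight to requests.get(base_url, params=params), as in the answer above.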