python requests - encoding with 'idna' codec failed (UnicodeError: label empty or too long) error

user1874064 · Aug 17, 2018

An API call I have been making with the requests package is suddenly raising the following error: "UnicodeError: encoding with 'idna' codec failed (UnicodeError: label empty or too long)"

I have no clue how to fix this. My code looks like the following, with the credentials faked for this example:

import requests

api_key = '123abc'
password = '12345'  # password that only idiots use on their luggage
shop_name = 'myshopname'
# The credentials are embedded in the URL as user:password@host
shop_url = 'https://%s:%s@%s.myecommercesite.com/admin/customers/1234567.json' % (api_key, password, shop_name)

a = requests.get(shop_url)

When I print shop_url and paste it into my browser, I get the JSON data I expect. But when I run this request, I get the idna codec error.

This used to work without problems, but something apparently changed somewhere, and I'm not sure whether the cause is the ecommerce site, requests, or something else.

Has anyone encountered this type of error, or does anyone know how to fix it?

If I print the URL, it looks like: https://123abc:12345@myshopname.myecommercesite.com/admin/customers/1234567.json

edit2: forgot to include %(api_key, password, shop_name) in my code example

edit: entire error message below:

UnicodeError                              Traceback (most recent call last)
~/anaconda3/lib/python3.6/encodings/idna.py in encode(self, input, errors)
    164                 if not (0 < len(label) < 64):
--> 165                     raise UnicodeError("label empty or too long")
    166             if len(labels[-1]) >= 64:

UnicodeError: label empty or too long

The above exception was the direct cause of the following exception:

UnicodeError                              Traceback (most recent call last)
<ipython-input-15-f834b116b751> in <module>()
----> 1 a = requests.get(shop_url)

~/anaconda3/lib/python3.6/site-packages/requests/api.py in get(url, params, **kwargs)
     70 
     71     kwargs.setdefault('allow_redirects', True)
---> 72     return request('get', url, params=params, **kwargs)
     73 
     74 

~/anaconda3/lib/python3.6/site-packages/requests/api.py in request(method, url, **kwargs)
     56     # cases, and look like a memory leak in others.
     57     with sessions.Session() as session:
---> 58         return session.request(method=method, url=url, **kwargs)
     59 
     60 

~/anaconda3/lib/python3.6/site-packages/requests/sessions.py in request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
    497 
    498         settings = self.merge_environment_settings(
--> 499             prep.url, proxies, stream, verify, cert
    500         )
    501 

~/anaconda3/lib/python3.6/site-packages/requests/sessions.py in merge_environment_settings(self, url, proxies, stream, verify, cert)
    670             # Set environment's proxies.
    671             no_proxy = proxies.get('no_proxy') if proxies is not None else None
--> 672             env_proxies = get_environ_proxies(url, no_proxy=no_proxy)
    673             for (k, v) in env_proxies.items():
    674                 proxies.setdefault(k, v)

~/anaconda3/lib/python3.6/site-packages/requests/utils.py in get_environ_proxies(url, no_proxy)
    690     :rtype: dict
    691     """
--> 692     if should_bypass_proxies(url, no_proxy=no_proxy):
    693         return {}
    694     else:

~/anaconda3/lib/python3.6/site-packages/requests/utils.py in should_bypass_proxies(url, no_proxy)
    674     with set_environ('no_proxy', no_proxy_arg):
    675         try:
--> 676             bypass = proxy_bypass(netloc)
    677         except (TypeError, socket.gaierror):
    678             bypass = False

~/anaconda3/lib/python3.6/urllib/request.py in proxy_bypass(host)
   2610             return proxy_bypass_environment(host, proxies)
   2611         else:
-> 2612             return proxy_bypass_macosx_sysconf(host)
   2613 
   2614     def getproxies():

~/anaconda3/lib/python3.6/urllib/request.py in proxy_bypass_macosx_sysconf(host)
   2587     def proxy_bypass_macosx_sysconf(host):
   2588         proxy_settings = _get_proxy_settings()
-> 2589         return _proxy_bypass_macosx_sysconf(host, proxy_settings)
   2590 
   2591     def getproxies_macosx_sysconf():

~/anaconda3/lib/python3.6/urllib/request.py in _proxy_bypass_macosx_sysconf(host, proxy_settings)
   2560             if hostIP is None:
   2561                 try:
-> 2562                     hostIP = socket.gethostbyname(hostonly)
   2563                     hostIP = ip2num(hostIP)
   2564                 except OSError:

UnicodeError: encoding with 'idna' codec failed (UnicodeError: label empty or too long)

Answer

smallwat3r · Oct 10, 2018

It seems this is an issue in Python's socket module rather than in requests. It fails when a single label of the hostname (a dot-separated part) is 64 characters or longer, not when the URL as a whole is. It is tracked at https://bugs.python.org/issue32958, which was still open when this answer was written.

The error can be consistently reproduced when the first label of the URL's hostname is 64 characters or longer, as in "0123456789012345678901234567890123456789012345678901234567890123.example.com". This wouldn't be a problem, except that the credentials are not separated from the first label of the hostname, so the entire "[user]:[secret]@XXX" section must be at most 63 characters long. This is problematic for services that use longer API keys and expect them to be submitted over basic auth.
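The length check can be tried directly against Python's built-in 'idna' codec, without making a request. This is a minimal sketch: idna_encodable is a hypothetical helper name, and the 40-character key and 30-character password are made-up lengths chosen only to push the first label past the limit shown in the traceback (0 < len(label) < 64).

```python
def idna_encodable(host):
    """Return True if Python's built-in 'idna' codec accepts this host string."""
    try:
        host.encode("idna")
    except UnicodeError:
        return False
    return True


# A 63-character first label is accepted; a 64-character one is rejected.
print(idna_encodable("a" * 63 + ".example.com"))
print(idna_encodable("a" * 64 + ".example.com"))

# Hypothetical credentials: a 40-character key and a 30-character password.
# With "user:password@" still attached, everything before the first dot
# counts as one label: 40 + 1 + 30 + 1 + len("myshopname") = 82 characters.
netloc = "k" * 40 + ":" + "p" * 30 + "@myshopname.example.com"
print(len(netloc.split(".")[0]), idna_encodable(netloc))
```

This only exercises the codec. In the traceback above, the string reaches the codec through socket.gethostbyname, which is called from the macOS proxy-bypass check.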

There is an alternative solution:

It seems you're trying to use the Shopify API so I'll take it as an example.

Encode {api_key}:{password} in base64 and send the result in the headers of your request, e.g. {'Authorization': 'Basic {token_base_64}'}.

See the example below:

import base64
import requests

auth = "[API KEY]:[PASSWORD]"
b64_auth = base64.b64encode(auth.encode()).decode("utf-8")

headers = {
    "Authorization": f"Basic {b64_auth}"
}

response = requests.get(
    url="https://[YOUR-SHOP].myshopify.com/admin/[ENDPOINT]",
    headers=headers
)
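If the URL has already been built with embedded credentials, as in the question, the same idea can be applied by splitting them back out. The following is a rough sketch using only the standard library; split_credentials is a hypothetical helper name, not part of requests, and it assumes a URL of the form scheme://user:password@host/path.

```python
import base64
from urllib.parse import unquote, urlsplit, urlunsplit


def split_credentials(url):
    """Return (url_without_credentials, headers) for a URL that may
    contain a "user:password@" prefix in its network location."""
    parts = urlsplit(url)
    if parts.username is None:
        # No embedded credentials: nothing to move.
        return url, {}
    # Keep only what follows the last "@" (host and optional port).
    host = parts.netloc.rpartition("@")[2]
    clean_url = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )
    # HTTP Basic auth: base64 of "user:password".
    user = unquote(parts.username)
    password = unquote(parts.password or "")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return clean_url, {"Authorization": f"Basic {token}"}


# The faked URL from the question.
shop_url = "https://123abc:12345@myshopname.myecommercesite.com/admin/customers/1234567.json"
clean_url, headers = split_credentials(shop_url)
print(clean_url)
print(headers)
```

The result can then be passed along as requests.get(clean_url, headers=headers). requests also accepts a (user, password) tuple through its auth parameter, which builds the same Basic header and likewise keeps the credentials out of the hostname.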