How to download PDF files using Python?

tiredandsarcastic · May 10, 2017 · Viewed 16.2k times

I was looking for a way to download PDF files in Python, and answers to other questions recommended the urllib module. I tried to download a PDF with it, but when I open the downloaded file, a message says that the file cannot be opened.

[Image: error message]

This is the code I used:

import urllib
urllib.urlretrieve("http://papers.gceguide.com/A%20Levels/Mathematics%20(9709)/9709_s11_qp_42.pdf", "9709_s11_qp_42.pdf")

What am I doing wrong? Also, the file is automatically saved to the directory my Python file is in. How do I change where it gets saved?

Edit: I tried again with the link to a sample PDF, http://unec.edu.az/application/uploads/2014/12/pdf-sample.pdf

The code is working with this link, so why won't it work for the other one?

Answer

Fensa Saj · Aug 14, 2017

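Try this with the requests library. The path you pass to open() decides where the file is saved, so change it to any folder you like.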

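import requests

url = 'https://pdfs.semanticscholar.org/c029/baf196f33050ceea9ecbf90f054fd5654277.pdf'
r = requests.get(url, stream=True)
r.raise_for_status()  # raise an exception on a 4xx/5xx response

# Write the body to the chosen path in chunks instead of loading it all at once.
with open('C:/Users/MICRO HARD/myfile.pdf', 'wb') as f:
    for chunk in r.iter_content(chunk_size=8192):
        f.write(chunk)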
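
Nothing in this thread establishes why the first link produced a file that would not open. One common cause, offered here only as an assumption, is that the server sends back an HTML error or redirect page instead of the PDF, and that page then gets saved under a .pdf name. A rough sketch of a check for that case is below: it looks for the "%PDF-" signature that PDF files normally start with and only then writes the bytes into a folder of your choice. The names looks_like_pdf and save_pdf are made up for this example and are not part of requests or urllib.

```python
import os


def looks_like_pdf(data):
    """Return True if the bytes start with the usual PDF signature."""
    return data[:5] == b"%PDF-"


def save_pdf(data, directory, filename):
    """Write data to directory/filename, refusing content that is not a PDF.

    Hypothetical helper for illustration; returns the path that was written.
    """
    if not looks_like_pdf(data):
        # Showing the first bytes usually reveals an HTML page or error text.
        raise ValueError("not a PDF, first bytes: %r" % data[:20])
    if not os.path.isdir(directory):
        os.makedirs(directory)  # create the target folder if it is missing
    path = os.path.join(directory, filename)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

With the answer's code, this would be called as save_pdf(requests.get(url).content, 'C:/Users/MICRO HARD', 'myfile.pdf'). If it raises ValueError, the bytes printed in the message should show what the server actually sent.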