Download all the files from a website

Bharath · Aug 7, 2017 · Viewed 11.9k times

I need to download all the files under these links, where only the suburb name changes from one link to the next.

Just for reference: https://www.data.vic.gov.au/data/dataset/2014-town-and-community-profile-for-thornbury-suburb

All the files are listed under this search link: https://www.data.vic.gov.au/data/dataset?q=2014+town+and+community+profile

Any possibilities?

Thanks :)

Answer

naren · Aug 7, 2017

You can download a file like this:

import urllib.request

# Fetch the file and read its contents (bytes) into memory
response = urllib.request.urlopen('http://www.example.com/file_to_download')
data = response.read()

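If you want to save what you downloaded to disk rather than keep it in memory, something like this should do it (the filename is just a placeholder):

with open('downloaded_file', 'wb') as f:
    f.write(data)
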
To get all the links on a page:

import requests
from bs4 import BeautifulSoup

# Fetch the page and parse the HTML
r = requests.get("http://site-to.crawl")
soup = BeautifulSoup(r.text, "html.parser")

# Print the target of every <a> tag on the page
for link in soup.find_all('a'):
    print(link.get('href'))
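
Putting the two steps together for your case, a rough sketch along these lines would fetch the search results, follow each dataset page, and save any files it finds. This is untested against the live site; the "/dataset/" link pattern and the file extensions are assumptions about how the site structures its pages, so adjust them to what you actually see in the HTML:

from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

SEARCH_URL = "https://www.data.vic.gov.au/data/dataset?q=2014+town+and+community+profile"

# Parse the search results page
search_page = BeautifulSoup(requests.get(SEARCH_URL).text, "html.parser")

# Collect links that look like dataset pages (assumed to contain '/dataset/')
dataset_urls = {urljoin(SEARCH_URL, a["href"])
                for a in search_page.find_all("a", href=True)
                if "/dataset/" in a["href"]}

for dataset_url in dataset_urls:
    dataset_page = BeautifulSoup(requests.get(dataset_url).text, "html.parser")
    # Download anything that looks like a data file (assumed extensions)
    for a in dataset_page.find_all("a", href=True):
        href = a["href"]
        if href.lower().endswith((".csv", ".xls", ".xlsx", ".pdf")):
            file_url = urljoin(dataset_url, href)
            filename = file_url.rsplit("/", 1)[-1]
            with open(filename, "wb") as f:
                f.write(requests.get(file_url).content)

If the search results span more than one page, you would also need to loop over the result pages before collecting the dataset links.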