Here is my code:
import csv
import requests
with requests.Session() as s:
s.post(url, data=payload)
download = s.get('url that directly download a csv report')
This gives me the access to the csv file. I tried different method to deal with the download:
This will give the the csv file in one string:
print download.content
This print the first row and return error: _csv.Error: new-line character seen in unquoted field
cr = csv.reader(download, dialect=csv.excel_tab)
for row in cr:
print row
This will print a letter in each row and it won't print the whole thing:
cr = csv.reader(download.content, dialect=csv.excel_tab)
for row in cr:
print row
My question is: what's the most efficient way to read a csv file in this situation. And how to download it.
thanks
This should help:
import csv
import requests
CSV_URL = 'http://samplecsvs.s3.amazonaws.com/Sacramentorealestatetransactions.csv'
with requests.Session() as s:
download = s.get(CSV_URL)
decoded_content = download.content.decode('utf-8')
cr = csv.reader(decoded_content.splitlines(), delimiter=',')
my_list = list(cr)
for row in my_list:
print(row)
Ouput sample:
['street', 'city', 'zip', 'state', 'beds', 'baths', 'sq__ft', 'type', 'sale_date', 'price', 'latitude', 'longitude']
['3526 HIGH ST', 'SACRAMENTO', '95838', 'CA', '2', '1', '836', 'Residential', 'Wed May 21 00:00:00 EDT 2008', '59222', '38.631913', '-121.434879']
['51 OMAHA CT', 'SACRAMENTO', '95823', 'CA', '3', '1', '1167', 'Residential', 'Wed May 21 00:00:00 EDT 2008', '68212', '38.478902', '-121.431028']
['2796 BRANCH ST', 'SACRAMENTO', '95815', 'CA', '2', '1', '796', 'Residential', 'Wed May 21 00:00:00 EDT 2008', '68880', '38.618305', '-121.443839']
['2805 JANETTE WAY', 'SACRAMENTO', '95815', 'CA', '2', '1', '852', 'Residential', 'Wed May 21 00:00:00 EDT 2008', '69307', '38.616835', '-121.439146']
[...]
Related question with answer: https://stackoverflow.com/a/33079644/295246
Edit: Other answers are useful if you need to download large files (i.e. stream=True
).