I am trying to read files using Python's ftplib without writing them. Something roughly equivalent to:
def get_page(url):
    try:
        return urllib.urlopen(url).read()
    except:
        return ""
but using FTP.
I tried:
def get_page(path):
    try:
        ftp = FTP('ftp.site.com', 'anonymous', 'passwd')
        return ftp.retrbinary('RETR '+path, open('page').read())
    except:
        return ''
but this doesn't work. The only examples in the docs write the file to disk, in the form ftp.retrbinary('RETR README', open('README', 'wb').write). Is it possible to read FTP files without writing them first?
Well, you have the answer right in front of you: the second parameter of the retrbinary method is a callback function, which is called with each block of file content as it is retrieved from the FTP connection.
Here is a simple example:
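#!/usr/bin/env python
from ftplib import FTP

def writeFunc(s):
    # Called once per block of data received; s holds the raw bytes.
    print("Read: %r" % (s,))

ftp = FTP('ftp.kernel.org')
ftp.login()  # no arguments: anonymous login
ftp.retrbinary('RETR /pub/README_ABOUT_BZ2_FILES', writeFunc)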
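Note that the callback is usually called several times for one file, once per block, which is why it has to accumulate the data somewhere. A small sketch that makes this visible by recording the size of every block (assuming Python 3; block_sizes is a name I made up, and it takes an FTP object that is already connected and logged in):

```python
def block_sizes(ftp, path, blocksize=1024):
    """Retrieve `path` and return the size of each block handed to the callback.

    `ftp` is assumed to be a connected, logged-in ftplib.FTP instance.
    `blocksize` is the maximum block size; blocks may arrive smaller.
    """
    sizes = []
    # The callback receives one bytes object per block; here only its length is kept.
    ftp.retrbinary('RETR ' + path, lambda block: sizes.append(len(block)), blocksize)
    return sizes
```

Calling block_sizes(ftp, '/pub/README_ABOUT_BZ2_FILES') on a file larger than blocksize should give a list with more than one entry.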
To keep the data instead of printing it, implement the callback so that it appends each block to an internal variable. For example, using a callable object:
#!/usr/bin/env python
from ftplib import FTP

class Reader:
    def __init__(self):
        self.data = b""  # retrbinary delivers bytes

    def __call__(self, s):
        # Append each block as it arrives.
        self.data += s

ftp = FTP('ftp.kernel.org')
ftp.login()
r = Reader()
ftp.retrbinary('RETR /pub/README_ABOUT_BZ2_FILES', r)
print(r.data)
Update: I realized that there is a class in the Python standard library that is meant for this kind of thing, an in-memory buffer. Because retrbinary passes bytes to the callback, the one to use is BytesIO rather than StringIO:
#!/usr/bin/env python
from ftplib import FTP
from io import BytesIO

ftp = FTP('ftp.kernel.org')
ftp.login()
r = BytesIO()
ftp.retrbinary('RETR /pub/README_ABOUT_BZ2_FILES', r.write)
print(r.getvalue())
Update 2: StringIO has been rolled into io (which also provides BytesIO). Incorporated @TimRichardson's comment.
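For completeness, here is a rough sketch of the get_page function from the question, rewritten to collect the data in memory. It assumes Python 3. The read_ftp_file helper and the host/user/passwd parameters are names I introduced for illustration (the defaults are the placeholder values from the question), and the function returns bytes rather than str.

```python
from ftplib import FTP, all_errors
from io import BytesIO

def read_ftp_file(ftp, path):
    """Return the contents of `path` as bytes, given a connected FTP object."""
    buf = BytesIO()
    # buf.write is called once per block, so the file never touches the disk.
    ftp.retrbinary('RETR ' + path, buf.write)
    return buf.getvalue()

def get_page(path, host='ftp.site.com', user='anonymous', passwd='passwd'):
    """Fetch `path` over FTP; return b'' if anything FTP-related goes wrong."""
    try:
        # Passing user/passwd to the constructor logs in right away;
        # the with-block closes the connection afterwards.
        with FTP(host, user, passwd) as ftp:
            return read_ftp_file(ftp, path)
    except all_errors:
        return b''
```

If you need text, decode the result yourself, e.g. get_page(path).decode('utf-8'), assuming the file really is UTF-8. Catching ftplib.all_errors instead of using a bare except keeps things like KeyboardInterrupt from being swallowed.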