I've been playing around with raw WSGI, cgi.FieldStorage and file uploads, and I just can't understand how it handles uploaded files.
At first it seemed to store the whole file in memory, so I thought that should be easy to test: a big file should clog up the memory. It didn't. Still, when I request the file, I get a string, not an iterator, a file object or anything like that.
I've tried reading the cgi module's source and found some things about temporary files, but it returns a freaking string, not a file(-like) object! So... how does it work?!
Here's the code I've used:
    import cgi
    from wsgiref.simple_server import make_server

    def app(environ, start_response):
        start_response('200 OK', [('Content-Type', 'text/html')])
        output = """
        <form action="" method="post" enctype="multipart/form-data">
        <input type="file" name="failas" />
        <input type="submit" value="Varom" />
        </form>
        """
        fs = cgi.FieldStorage(fp=environ['wsgi.input'], environ=environ)
        f = fs.getfirst('failas')
        print type(f)
        return output

    if __name__ == '__main__':
        httpd = make_server('', 8000, app)
        print 'Serving'
        httpd.serve_forever()
Thanks in advance! :)
The cgi module documentation has a paragraph on handling file uploads:
If a field represents an uploaded file, accessing the value via the value attribute or the getvalue() method reads the entire file in memory as a string. This may not be what you want. You can test for an uploaded file by testing either the filename attribute or the file attribute. You can then read the data at leisure from the file attribute:
    fileitem = form["userfile"]
    if fileitem.file:
        # It's an uploaded file; count lines
        linecount = 0
        while 1:
            line = fileitem.file.readline()
            if not line: break
            linecount = linecount + 1
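To make the quoted paragraph concrete, here is a minimal model of the behaviour it describes: the upload sits in a temporary file, and asking for the value reads that file back into one string. This is only a sketch for illustration, not the cgi module's actual code; the class name `UploadedField` and its constructor are made up.

```python
import tempfile


class UploadedField(object):
    """Hypothetical stand-in for an uploaded form field (not the cgi class)."""

    def __init__(self, data):
        # The upload body is written to a temporary file, not kept as a string.
        self.file = tempfile.TemporaryFile()
        self.file.write(data)
        self.file.seek(0)

    @property
    def value(self):
        # Asking for the value reads the whole file into memory at once.
        self.file.seek(0)
        data = self.file.read()
        self.file.seek(0)
        return data


field = UploadedField(b"line one\nline two\n")
whole = field.value             # the entire content as one string
first = field.file.readline()   # read from the file one line at a time
```

In this model memory only fills up when `value` is read, which would match what the question observed: parsing a big upload is cheap, but the result of a value-style call is a plain string.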
Regarding your example, getfirst() is just a version of getvalue(), so it also reads the whole upload into a string.
Try replacing
    f = fs.getfirst('failas')
with
    f = fs['failas'].file
This will return a file-like object that is readable "at leisure".
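Once you have the file-like object, one way to use it is to copy it to disk in chunks instead of calling read() on the whole thing. A rough sketch, assuming the field object exposes a binary `.file` attribute as the docs describe; `save_upload`, `dest_path` and the chunk size are names and values I picked for illustration.

```python
import shutil


def save_upload(fileitem, dest_path, chunk_size=64 * 1024):
    """Copy an uploaded file to dest_path without loading it all into memory.

    fileitem is assumed to expose .file, a readable binary file-like object.
    Returns False when the field carries no file.
    """
    source = getattr(fileitem, "file", None)
    if source is None:
        return False  # plain form field or nothing uploaded
    with open(dest_path, "wb") as dest:
        # copyfileobj reads and writes at most chunk_size bytes at a time.
        shutil.copyfileobj(source, dest, chunk_size)
    return True
```

In your app this would be called as something like `save_upload(fs['failas'], '/tmp/upload.bin')` (the path is just an example). On the initial GET there is no 'failas' field, so check `'failas' in fs` before indexing.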