I am trying to use the requests library in Python to upload a file to a Fedora Commons repository on localhost. I'm fairly certain my main problem is not understanding open()/read() and what I need to do to send data with an HTTP request.
import requests
from requests.auth import HTTPBasicAuth

def postBinary(fileName, dirPath, url):
    path = dirPath + '/' + fileName
    print('to ' + url + '\n' + path)
    openBin = {'file': (fileName, open(path, 'rb').read())}
    headers = {'Slug': fileName}  # not important
    r = requests.put(url, files=openBin, headers=headers, auth=HTTPBasicAuth('username', 'pass'))
    print(r.text)
    print("and the url used:")
    print(r.url)
This successfully uploads a file to the repository, but the stored copy is slightly larger and corrupted. For example, a 6.6 KB image became 6.75 KB and could no longer be opened.
So how should I properly open and upload a file using PUT in Python?
### Extra details:
When I replace files=openBin with data=openBin, the uploaded content ends up being my dictionary, with what I presume is the file data encoded as a string. I don't know whether that information is helpful.
"file=FILE_NAME.extension&file=TYPE89a%24%02Q%03%E7%FF%00E%5B%19%FC%....
and the size of the file increases to several megabytes.
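The behaviour described above can be reproduced without a server by preparing the request instead of sending it. This is only a sketch: the payload bytes, the file name example.gif and the URL are made up for illustration, and nothing is transmitted.

```python
import requests

# Stand-in bytes that merely resemble the start of a GIF file (illustrative only).
payload = b"GIF89a\x24\x02\x51\x03\xe7\xff\x00"

# Hypothetical URL; the request is prepared but never sent.
url = "http://localhost:8080/fcrepo/rest/example.gif"

# Passing a dict to data= makes requests form-encode it. A tuple value
# becomes several fields that share the same key.
form = requests.Request(
    "PUT", url, data={"file": ("example.gif", payload)}
).prepare()

print(form.headers["Content-Type"])
print(form.body)
```

The printed body is a percent-encoded form string rather than the original bytes, which is why the stored object no longer looks like the uploaded file.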
I am specifically using PUT because the Fedora RESTful HTTP API endpoint says to use PUT.
The following command does work:
curl -u username:password -H "Content-Type: text/plain" -X PUT -T /path/to/someFile.jpeg http://localhost:8080/fcrepo/rest/someFile.jpeg
Updated
Using requests.put() with the files parameter sends a multipart/form-data encoded request, which the server does not seem to be able to handle without corrupting the data, even when the correct content type is declared.
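One way to see the extra bytes that multipart encoding adds is to prepare both kinds of request locally and compare the body sizes. This is a rough sketch, not a diagnosis of the Fedora server: the payload, the file name and the URL are assumptions, and neither request is actually sent.

```python
import requests

# Arbitrary stand-in payload (illustrative only).
payload = b"GIF89a" + bytes(range(256)) * 4

# Hypothetical URL; the requests are prepared but never sent.
url = "http://localhost:8080/fcrepo/rest/example.gif"

# files= wraps the payload in a multipart/form-data envelope.
multipart = requests.Request(
    "PUT", url, files={"file": ("example.gif", payload)}
).prepare()

# data= with bytes puts the payload in the body unchanged.
raw = requests.Request("PUT", url, data=payload).prepare()

print(multipart.headers["Content-Type"])
print(len(payload), len(multipart.body), len(raw.body))
```

The multipart body contains the original bytes plus boundary lines and part headers, so a server that stores the body verbatim ends up with a slightly larger file that no longer opens.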
The curl command simply performs a PUT with the raw data contained in the body of the request. You can create a similar request by passing the file data in the data parameter. Specify the content type in the header:
headers = {'Content-type': 'image/jpeg', 'Slug': fileName}
r = requests.put(url, data=open(path, 'rb'), headers=headers, auth=('username', 'pass'))
You can vary the Content-type header to suit the payload as required.
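Putting this together, a possible version of the upload function might look like the sketch below. The names put_binary and guess_content_type are hypothetical helpers, not part of requests, and guessing the type from the file extension with the standard mimetypes module is an assumption about what the repository expects.

```python
import mimetypes
import os

import requests


def guess_content_type(path):
    """Guess a MIME type from the file extension (hypothetical helper)."""
    content_type, _ = mimetypes.guess_type(path)
    # Fall back to a generic binary type when the extension is unknown.
    return content_type or "application/octet-stream"


def put_binary(file_name, dir_path, url, auth):
    """PUT the raw file bytes as the request body (hypothetical helper)."""
    path = os.path.join(dir_path, file_name)
    headers = {"Content-Type": guess_content_type(path), "Slug": file_name}
    # The with block closes the file once the request has completed.
    with open(path, "rb") as fh:
        return requests.put(url, data=fh, headers=headers, auth=auth)
```

It could then be called as put_binary('someFile.jpeg', '/path/to', 'http://localhost:8080/fcrepo/rest/someFile.jpeg', ('username', 'password')), mirroring the curl command in the question.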
Try setting the Content-type for the file.
If you are sure that it is a text file, then try text/plain, which you used in your curl command, even though you appear to be uploading a JPEG file. For a JPEG image, however, you should use image/jpeg.
Otherwise, for arbitrary binary data, you can use application/octet-stream:
openBin = {'file': (fileName, open(path,'rb'), 'image/jpeg' )}
Also, it is not necessary to explicitly read the file contents in your code. requests will do that for you, so just pass the open file handle as shown above.