Ruby - How to get the name of a file with open-uri?

ElektroStudios · Nov 15, 2012 · Viewed 8.4k times

I want to download a music file this way:

require 'open-uri'

source_url = "http://soundcloud.com/stereo-foo/cohete-amigo/download"

attachment_file = "test.wav"

open(attachment_file, "wb") do |file|  
  file.print open(source_url).read
end

In that example I want to change "test.wav" to the real file name (as, for example, the JDownloader program does).

EDIT: I don't mean the temporary file; I mean the name of the file as stored on the web, which JDownloader reports as "Cohete Amigo - Stereo Foo.wav".

Thank you for reading.

UPDATE:

I've tried this to store the name:

attachment_file = File.basename(open(source_url))

I don't think that makes sense, but I don't know how to do it, sorry.

Answer

Casper · Nov 15, 2012

The filename is stored in the Content-Disposition header field. However, decoding this field can be a little tricky. See, for example, the discussion here:

How to encode the filename parameter of Content-Disposition header in HTTP?

For open-uri you can access all the header fields through the meta accessor of the returned object (a Tempfile or StringIO extended with OpenURI::Meta):

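require 'open-uri'
# URI.open is required on Ruby 3.0+; older versions also accepted a bare open(url).
f = URI.open('http://soundcloud.com/stereo-foo/cohete-amigo/download')
f.meta['content-disposition']
=> "attachment;filename=\"Stereo Foo - Cohete Amigo.wav\""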

So in order to decode something like that you could do this:

cd = f.meta['content-disposition']
filename = cd.match(/filename=(\"?)(.+)\1/)[2]
=> "Stereo Foo - Cohete Amigo.wav"

This works for your particular case, and it also works if the quotes " are not present. More complex Content-Disposition values, such as UTF-8 filenames sent in the extended filename*= form, will not be decoded correctly by this regex. I'm not sure how often that form is used, or whether SoundCloud ever sends it, so you may not need to worry about it (not confirmed or tested).
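If UTF-8 names do turn out to matter, a slightly more defensive parser could look roughly like the sketch below. The helper name filename_from_content_disposition is made up for illustration (it is not part of open-uri), and it assumes the server uses either the plain filename= form or the extended filename*=UTF-8''... form with percent-encoded bytes. Other charsets and malformed headers are not handled.

```ruby
# Hypothetical helper, not part of open-uri: returns the filename found in
# a Content-Disposition header value, or nil if there is none.
def filename_from_content_disposition(header)
  return nil if header.nil?

  # Extended form: filename*=UTF-8'<language>'<percent-encoded bytes>
  # Assumption: only the UTF-8 charset is supported in this sketch.
  if (m = header.match(/filename\*\s*=\s*UTF-8'[^']*'([^;\s]+)/i))
    # Work on a binary copy so each %XX escape becomes one raw byte.
    bytes = m[1].b.gsub(/%(\h\h)/) { $1.hex.chr }
    return bytes.force_encoding(Encoding::UTF_8)
  end

  # Plain form: filename="quoted name" or filename=token
  if (m = header.match(/filename\s*=\s*(?:"([^"]*)"|([^;\s]+))/i))
    return m[1] || m[2]
  end

  nil
end
```

When both forms are present, this sketch prefers the extended one, on the assumption that the plain filename= is only an ASCII fallback.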

You could also use a more advanced web-crawling framework like Mechanize, and trust it to do the decoding for you:

require 'mechanize'

agent = Mechanize.new
file = agent.get('http://soundcloud.com/stereo-foo/cohete-amigo/download')
file.filename
=> "Stereo_Foo_-_Cohete_Amigo.wav"
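Going back to the open-uri approach, the pieces can be combined so the file is fetched only once and saved under the server-supplied name. The sketch below is one possible way to pick that name; target_filename is a hypothetical helper, and the fallback to the last URL path segment is an assumption about what you would want when the header is missing.

```ruby
require 'uri'

# Hypothetical helper: picks a local filename from the response headers
# (the hash returned by open-uri's `meta`), falling back to the last
# segment of the URL path when no filename is given.
def target_filename(meta, url)
  cd = meta['content-disposition']
  match = cd && cd.match(/filename=(\"?)(.+)\1/)
  name = match ? match[2] : File.basename(URI(url).path)
  # Drop any directory components so a header such as "../../x" cannot
  # write outside the current directory.
  File.basename(name)
end
```

With open-uri loaded, this could be used as URI.open(source_url) { |remote| File.binwrite(target_filename(remote.meta, source_url), remote.read) }, which reads the headers and the body from the same response.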