How to test that an open-uri URL exists before processing any data

Kostas · Aug 24, 2011

I'm trying to process content from a list of links using "open-uri" in Ruby (1.8.6), but I get an error whenever a link is broken or requires authentication:

open-uri.rb:277:in `open_http': 404 Not Found (OpenURI::HTTPError)
from C:/tools/Ruby/lib/ruby/1.8/open-uri.rb:616:in `buffer_open'
from C:/tools/Ruby/lib/ruby/1.8/open-uri.rb:164:in `open_loop'
from C:/tools/Ruby/lib/ruby/1.8/open-uri.rb:162:in `catch' 

or

C:/tools/Ruby/lib/ruby/1.8/net/http.rb:560:in `initialize': getaddrinfo: no address associated with hostname. (SocketError)
from C:/tools/Ruby/lib/ruby/1.8/net/http.rb:560:in `open'
from C:/tools/Ruby/lib/ruby/1.8/net/http.rb:560:in `connect'
from C:/tools/Ruby/lib/ruby/1.8/timeout.rb:53:in `timeout'

or

C:/tools/Ruby/lib/ruby/1.8/net/protocol.rb:133:in `sysread': An existing connection was forcibly closed by the remote host. (Errno::ECONNRESET)
from C:/tools/Ruby/lib/ruby/1.8/net/protocol.rb:133:in `rbuf_fill'
from C:/tools/Ruby/lib/ruby/1.8/timeout.rb:62:in `timeout'
from C:/tools/Ruby/lib/ruby/1.8/timeout.rb:93:in `timeout'

Is there a way to test the response (URL) before processing any data?

The code is:

require 'open-uri'

smth.css.each do |item|
  open(item[:name], 'wb') do |file|
    file << open(item[:href]).read
  end
end

Many thanks

Answer

dogenpunk · Aug 24, 2011

You could try something along the lines of:

    require 'open-uri'

    smth.css.each do |item|
      begin
        # item[:name] and item[:href] are lookups on item, so they are not quoted
        open(item[:name], 'wb') do |file|
          file << open(item[:href]).read
        end
      rescue OpenURI::HTTPError => e
        # the server answered with an error status (404, 401, ...)
      rescue SocketError => e
        # the host name could not be resolved
      rescue Errno::ECONNRESET => e
        # the remote host closed the connection
      end
    end

I don't know of any way of testing the connection without opening it and trying, so rescuing these errors is the only way I can think of. The thing to be aware of is the class hierarchy: OpenURI::HTTPError and SocketError are both subclasses of StandardError, and Errno::ECONNRESET is a subclass of SystemCallError, which is itself a subclass of StandardError. So a bare rescue => e would catch all three. Giving each class its own rescue clause lets you handle them differently, and any other exception still propagates without an explicit raise.
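To keep the loop readable, the rescue logic can be pulled into a small helper. The following is only a sketch under stated assumptions: `try_fetch` is a hypothetical name, not part of open-uri, and it takes the actual fetch as a block so it can be tried without a network connection. The last lines print whether each error class descends from StandardError, which is what decides whether a bare rescue would catch it.

```ruby
require 'open-uri'
require 'socket'

# Hypothetical helper (not part of open-uri): runs the given block, which is
# expected to fetch and return a document body, and turns the three failures
# from the question into a [nil, reason] pair instead of an exception.
# Any other exception is not rescued here and propagates to the caller.
def try_fetch
  [yield, nil]
rescue OpenURI::HTTPError => e
  [nil, "HTTP error: #{e.message}"]        # e.g. "HTTP error: 404 Not Found"
rescue SocketError => e
  [nil, "lookup failed: #{e.message}"]
rescue Errno::ECONNRESET => e
  [nil, "connection reset: #{e.message}"]
end

# Print whether each class is a descendant of StandardError.
[OpenURI::HTTPError, SocketError, Errno::ECONNRESET].each do |klass|
  puts "#{klass} < StandardError: #{klass < StandardError}"
end

# Possible use inside the loop (Ruby 1.8/1.9 spelling; on Ruby 3.0 and later
# the URL call is URI.open rather than open):
#
#   body, error = try_fetch { open(item[:href]).read }
#   if body
#     open(item[:name], 'wb') { |file| file << body }
#   else
#     warn "skipping #{item[:href]}: #{error}"
#   end
```

Fetching the body before opening the output file also means a broken link does not leave an empty file behind.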