I am fetching some pages over the Web using Perl's LWP::UserAgent and would like to be as polite as possible. By default, LWP::UserAgent does not seamlessly handle compressed content via gzip. Is there an easy way to make it do so, to save everyone some bandwidth?
LWP has this capability built in, thanks to HTTP::Message. But it's a bit hidden.
First make sure you have Compress::Zlib installed so you can handle gzip. HTTP::Message::decodable() returns the list of encodings that the modules on your system can decode; in scalar context, it returns a comma-delimited string that you can use for the 'Accept-Encoding' HTTP header, which LWP requires you to add to your HTTP::Requests yourself. (On my system, with Compress::Zlib installed, the list is "gzip, x-gzip, deflate".)
When your HTTP::Response comes back, be sure to access the content with $response->decoded_content instead of $response->content.

In LWP::UserAgent, it all comes together like this:
use strict;
use warnings;
use LWP::UserAgent;   # loads HTTP::Message for us

my $ua = LWP::UserAgent->new;
my $can_accept = HTTP::Message::decodable();   # e.g. "gzip, x-gzip, deflate"
my $response = $ua->get('http://stackoverflow.com/feeds',
    'Accept-Encoding' => $can_accept,
);
print $response->decoded_content;   # the uncompressed body
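If you're making lots of requests with the same agent, one variation (not what the answer above does, but using LWP::UserAgent's default_header method) is to set the header once on the agent instead of passing it to every get:

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
# Every request made through this agent will now advertise the encodings
# that HTTP::Message can decode on this system.
$ua->default_header('Accept-Encoding' => HTTP::Message::decodable());

my $response = $ua->get('http://stackoverflow.com/feeds');
print $response->decoded_content if $response->is_success;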
This will also decode the text into Perl's Unicode strings. If you only want LWP to uncompress the response, and not touch the text at all, do this:
print $response->decoded_content(charset => 'none');
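And if you want to convince yourself the compression is actually happening, a rough check (assuming the server honours your Accept-Encoding header) is to compare the raw body with the decoded one:

use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
my $response = $ua->get('http://stackoverflow.com/feeds',
    'Accept-Encoding' => HTTP::Message::decodable(),
);

# content() is the raw (possibly compressed) body; decoded_content() is the
# body after the Content-Encoding has been undone.
printf "Content-Encoding: %s\n", $response->header('Content-Encoding') || 'none';
printf "Bytes on the wire: %d\n", length $response->content;
printf "Bytes decoded:     %d\n", length $response->decoded_content(charset => 'none');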