Extract GET URIs or their responses from Wireshark capture to separate file(s)

TheLostOne · Jun 17, 2012 · Viewed 18.3k times

Issue

I use Wireshark to capture an HTTP video stream, and I've used the following filter to select the relevant GET requests.

http.request.uri contains "identifier" && http.request.method == "GET" && ip.addr == xxx.xxx.xxx.xxx

Questions

  1. Is it possible to extract all GET URLs to a separate .txt file?

  2. Or is it possible to extract the raw response packets (without the headers) that match the filter above to separate files, so that I eventually end up with a set of individual video files?

I hope I made myself clear enough ;-)

Thank you

Answer

mavam · Jun 18, 2012

While this may be doable with Wireshark, it is orders of magnitude easier with Bro.

Extracting URIs

Simply run it with your trace file:

bro -r <trace>

This invocation generates a bunch of log files in the current directory. The one you are interested in is http.log. You can filter the output to obtain only the GET requests:

bro-cut id.orig_h id.resp_h method host uri < http.log | awk '$3 == "GET"'

Example output:

192.168.1.104   212.96.161.238  GET   update.avg.com  /softw/90/update/avg9infowin.ctf
192.168.1.104   77.67.44.206    GET   backup.avg.cz   /softw/90/update/u7avi1777u1705ff.bin
192.168.1.104   198.189.255.75  GET   aa.avg.com      /softw/90/update/u7iavi2511u2510ff.bin
192.168.1.104   77.67.44.206    GET   backup.avg.cz   /softw/90/update/x8xplsb2_118c8.bin

As you can see, the last two columns make up the full URL. To remove the space in-between, you could use awk to concatenate the last two fields.
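The concatenation can be sketched as a small shell function. This is a minimal sketch, assuming bro-cut's default tab-separated output with the five columns in the order shown above. The function name get_urls, the http:// prefix, the made-up POST row, and the file name urls.txt are illustrative assumptions, not part of Bro.

```shell
# get_urls: read bro-cut rows (tab-separated: orig_h, resp_h, method, host, uri)
# on stdin and print one full URL per GET request.
# Assumption: plain HTTP, so "http://" is prepended; drop it if you only want host+uri.
get_urls() {
    awk -F'\t' '$3 == "GET" { print "http://" $4 $5 }'
}

# Demo on two rows shaped like the example output (the POST row is hypothetical).
# printf reuses the format string for each group of five arguments.
printf '%s\t%s\t%s\t%s\t%s\n' \
    192.168.1.104 212.96.161.238 GET update.avg.com /softw/90/update/avg9infowin.ctf \
    192.168.1.104 77.67.44.206 POST backup.avg.cz /upload \
    | get_urls > urls.txt

cat urls.txt
```

With a real trace, the same function can be fed from the command above: bro-cut id.orig_h id.resp_h method host uri < http.log | get_urls > urls.txt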

Extracting Files

Note: the upcoming Bro 2.1 release will have major improvements for file extraction. Until then, you can extract all files from an HTTP stream by specifying the MIME type of the files to store:

bro -r <trace> 'HTTP::extract_file_types = /video\/avi/'

Bro sniffs the MIME type of each HTTP body and, if it matches the regular expression /video\/avi/, writes the body to a file whose name starts with the prefix http-item. You can change the prefix by redefining the HTTP::extraction_prefix variable.