Easiest way to extract the URLs from an HTML page using sed or awk only

codaddict · Dec 10, 2009 · Viewed 89.4k times

I want to extract the URLs from the anchor tags of an HTML file. This needs to be done in Bash using sed/awk. No Perl, please.

What is the easiest way to do this?

Answer

Hardy · Jan 4, 2010

You could also do something like this (provided you have lynx installed)...

Lynx versions < 2.8.8

lynx -dump -listonly my.html

Lynx versions >= 2.8.8 (courtesy of @condit)

lynx -dump -hiddenlinks=listonly my.html
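
If lynx is not available, something along these lines might work with awk alone, as the question asked. This is only a rough sketch, not a real HTML parser: the function name extract_urls is made up for illustration, and it assumes every anchor tag sits on a single line and that href values are double-quoted. Anything more irregular (single quotes, unquoted values, tags split across lines) will be missed.

```shell
# Hypothetical helper: print the href value of each <a ...> tag, one per line.
# Assumptions: href values are double-quoted and each tag fits on one line.
extract_urls() {
  awk '{
    line = $0
    # Repeatedly find the next anchor tag that carries an href attribute.
    while (match(line, /<[aA][ \t][^>]*[hH][rR][eE][fF]="[^"]*"/)) {
      tag = substr(line, RSTART, RLENGTH)
      sub(/^.*[hH][rR][eE][fF]="/, "", tag)  # drop everything up to the opening quote
      sub(/"$/, "", tag)                     # drop the closing quote
      print tag
      line = substr(line, RSTART + RLENGTH)  # continue after this match
    }
  }' "$@"
}
```

Usage would then be:

extract_urls my.html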