mirror single page with httrack

Max · Dec 28, 2009

I am trying to use httrack (http://www.httrack.com/) to download a single page, not the entire site. So, for example, when using httrack to download www.google.com it should only download the HTML found under www.google.com along with all stylesheets, images and JavaScript, and not follow any links to images.google.com, labs.google.com or www.google.com/subdir/ etc.

I tried the -w option but that did not make any difference.

What would be the right command?

EDIT

I tried using httrack "http://www.google.com/" -O "./www.google.com" "http://www.google.com/" -v -s0 --depth=1 but then it won't copy any images.

What I basically want is to download just the index file of that domain along with all its assets, but not the content of any external or internal links.

Answer

Sourav Ghosh · Jan 19, 2015
httrack "http://www.google.com/" -O "./www.google.com" "http://www.google.com/" -v -s0 --depth=1 -n

The -n option (or --near) downloads the images embedded in a web page no matter where they are located.

Say an image is located at google.com/foo/bar/logo.png. Since you are using -s0 (stay in the same directory), it will not be downloaded unless you specify --near.
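Putting the pieces together, a commented sketch of the full invocation (flag descriptions paraphrased from httrack's help output; the target URL and output directory are just examples):

```shell
# Mirror a single page plus its embedded assets with httrack.
httrack "http://www.google.com/" \
    -O "./www.google.com" \  # output directory for the mirror
    -v \                     # verbose output to the console
    -s0 \                    # ignore robots.txt rules (sN sets robots mode)
    --depth=1 \              # recurse only 1 level: the page itself
    -n                       # --near: also fetch non-HTML files (images,
                             # CSS, JS) linked from the page, even when
                             # they live outside the mirrored path
```

Note that httrack's scan rules (the `+`/`-` filter patterns) can further restrict which URLs are followed if `--depth=1 -n` still pulls in more than you want.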