Is there a way to recover an entire website from the Wayback Machine?
I have an old site that is archived, but I no longer have the website files to revive it. Is there a way to recover the old data so I can get my long-lost files back?
wget is a great tool for mirroring an entire site, and if you are on Windows, you can use Cygwin to install it. The following command will mirror a site: wget -m domain.name
The example wget command below won't ascend to the parent directory (-np), ignores robots.txt (-e robots=off), follows links only on the listed domains, which lets you include the CDN domain (--domains=domain.name), and mirrors a URL (--mirror, the long form of -m, followed by the URL to mirror, e.g. http://an.example.com). All together you get:
wget -np -e robots=off --mirror --domains=staticweb.archive.org,web.archive.org http://web.archive.org/web/19970708161549/http://www.google.com/
If you are dealing with https and a self-signed cert, you can use --no-check-certificate to disable the certificate check. The wget help (wget --help) is the best place to see the possible options.
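If you need to do this for a different snapshot or site, the command above can be built from its two variable parts: the capture timestamp and the original URL. The following is only a rough sketch; the function names wayback_url and mirror_command are made up for illustration, and the URL layout is simply copied from the example command above. It prints the wget command rather than running it, so you can check it first.

```shell
#!/bin/sh

# Hypothetical helper: build a Wayback Machine snapshot URL.
# $1: capture timestamp as it appears in the archive URL (e.g. 19970708161549)
# $2: original site URL (e.g. http://www.google.com/)
wayback_url() {
    printf 'http://web.archive.org/web/%s/%s\n' "$1" "$2"
}

# Hypothetical helper: print (not run) the mirror command for a snapshot,
# using the same wget options as the example command above.
mirror_command() {
    printf 'wget -np -e robots=off --mirror --domains=staticweb.archive.org,web.archive.org %s\n' \
        "$(wayback_url "$1" "$2")"
}

# Example: reproduces the command shown above.
mirror_command 19970708161549 http://www.google.com/
```

Once the printed command looks right, paste it into your shell to start the mirror.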