Using wget to mirror entire website folder

GNU wget can create a very usable local or offline mirror of an entire folder of a website using the following combination of options.

From the output directory, use this invocation:

wget -r -l inf -k -N -np -p -E "$URL"

You can play around with the options:

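-r
recurse into linked pages
-l inf
follow links to unlimited depth
-k
convert links after downloading so that they work locally
-N
use timestamping: skip files that are not newer than the local copy
-np
no parent: never ascend above the starting directory
-p
retrieve the prerequisites of pages (images, stylesheets and so on)
-E
adjust file extensions according to MIME type (for example, append .html)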

The files will be stored under hostname/path/to/file. If you know up front that the origin of the files is restricted to a particular host or even subpath, you can additionally use:

-nH
no subdirectory for the host
--cut-dirs=N
strip N leading path components
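
As a rough sketch of how the host and path stripping fits together: the snippet below counts the directory components of a folder URL and builds a wget command line that uses -nH and --cut-dirs, so the mirrored files land directly in the current directory. The URL is a made-up placeholder, and the command is only printed, not executed, so that it can be checked first.

```shell
#!/bin/sh
# Hypothetical folder URL; replace it with the folder you want to mirror.
# Keep the trailing slash so that -np treats it as a directory.
URL="https://example.com/docs/manual/"

# Reduce the URL to its directory path.
path=${URL#*://}   # drop the scheme         -> example.com/docs/manual/
path=${path#*/}    # drop the hostname       -> docs/manual/
path=${path%/}     # drop the trailing slash -> docs/manual

# Count the path components that --cut-dirs should strip.
cut_dirs=$(printf '%s\n' "$path" | awk -F/ '{ print NF }')

# Print the resulting command instead of running it.
cmd="wget -r -l inf -k -N -np -p -E -nH --cut-dirs=$cut_dirs $URL"
echo "$cmd"
```

For the placeholder URL this yields --cut-dirs=2, so a file at docs/manual/page.html would be stored as page.html rather than example.com/docs/manual/page.html.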

One thought on “Using wget to mirror entire website folder”

  1. Bruno

    Alternative when you want to download a flat Apache directory listing:

    wget -r -np -l1 -nd -N "$URL"

    The option -nd prevents the creation of the local directory structure.

    Afterwards you can probably remove index.html and robots.txt.
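
A minimal sketch of that cleanup step, assuming the files were downloaded flat into a single directory. The helper name cleanup_listing is made up for illustration and is not part of wget. It also removes names such as "index.html?C=N;O=D", which an Apache listing's column-sort links may leave behind.

```shell
#!/bin/sh
# cleanup_listing is a hypothetical helper, not a wget feature.
# It deletes the listing leftovers from DIR (default: current directory).
cleanup_listing() {
    dir=${1:-.}
    # Remove index.html, any "index.html?..." sort-order variants, and robots.txt.
    # With -f, rm stays silent about names that do not exist.
    rm -f -- "$dir"/index.html "$dir"/index.html\?* "$dir"/robots.txt
}

# Usage (run inside the download directory):
#   cleanup_listing .
```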

