How to make an offline copy of a website

To create an offline, browsable copy of a website, you can use the tool wget.
I will guide you through the steps to create an offline copy that fits your needs.

Download the website

wget --mirror --no-check-certificate -e robots=off --timestamping --recursive --level=inf \
--no-parent --page-requisites --convert-links --adjust-extension --backup-converted -U Mozilla \
--reject-regex feed -R '*.gz, *.tar, *.mp3, *.zip, *.flv, *.mpg, *.pdf' http://test.com
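
To reuse the command for other sites, it may help to wrap it in a small shell function. The following is only a sketch, assuming a POSIX shell: mirror_site and the DRY_RUN variable are hypothetical names chosen for illustration, not part of wget. The flags are the ones used above, except that --timestamping, --recursive and --level=inf are left out because --mirror already implies them.

```shell
# Sketch only: mirror_site and DRY_RUN are illustrative names, not part of wget.
mirror_site() {
  url=$1
  # File types to skip; pass a second argument to override the list.
  reject=${2:-'*.gz,*.tar,*.mp3,*.zip,*.flv,*.mpg,*.pdf'}
  # Build the argument list once so it can be either printed or executed.
  set -- wget --mirror --no-check-certificate -e robots=off \
    --no-parent --page-requisites --convert-links --adjust-extension \
    --backup-converted -U Mozilla --reject-regex feed -R "$reject" "$url"
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"   # show the command instead of running it
  else
    "$@"
  fi
}
```

Running `DRY_RUN=1 mirror_site http://test.com` prints the wget command without touching the network, which is a cheap way to check the options before starting a long download.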

Change into the directory of the offline copy (wget names it after the host in the URL):

cd test.com

Clean up temp files

Because of --backup-converted, wget keeps the original version of every file it rewrites as a .orig file. Remove them:

find . -type f -name '*.orig' -exec rm -f {} +
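
Before deleting anything, it can be reassuring to see what would be removed. A minimal sketch, assuming the backups are exactly the files ending in .orig that --backup-converted left behind; list_backups and count_backups are hypothetical helper names.

```shell
# Sketch: list_backups and count_backups are illustrative helpers, not wget features.

# Print every *.orig file below the given directory (default: current one).
list_backups() {
  find "${1:-.}" -type f -name '*.orig'
}

# Count those files without deleting anything.
# Assumes no file name contains a newline.
count_backups() {
  list_backups "$1" | wc -l | tr -d ' '
}
```

If the list contains only files you expect, the removal command can be run with more confidence.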

Optimize images

If you want, you can re-encode JPEG images at a lower quality to save disk space. mogrify (part of ImageMagick) overwrites the files in place:

find . -type f -iname '*.jpg' -exec mogrify -strip -quality 20 {} +
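
To see how much space the re-encoding actually saves, you could measure the JPEG files before and after. This is a rough sketch; jpeg_bytes is a hypothetical helper name, and it only looks at files ending in .jpg (in any letter case), like the command above.

```shell
# Sketch: jpeg_bytes is an illustrative helper name.
# Prints the combined size in bytes of all *.jpg files below a directory
# (default: current directory).
jpeg_bytes() {
  find "${1:-.}" -type f -iname '*.jpg' -exec cat {} + | wc -c | tr -d ' '
}

# Possible usage around the mogrify step:
#   before=$(jpeg_bytes .)
#   ... run the mogrify command ...
#   after=$(jpeg_bytes .)
#   echo "saved $((before - after)) bytes"
```

Since the conversion is lossy and happens in place, it may be worth keeping a copy of the directory until you have checked the result in a browser.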

PNG files can be converted to JPEG as well, but each converted file has to keep its original name, including the png ending, so that the links in the offline website do not break. Keep in mind that JPEG has no transparency, so transparent PNG files will lose it.

# convert every PNG (any letter case of the ending) to a JPEG next to it
find . -type f -iname '*.png' -exec mogrify -strip -quality 20 -format jpg {} +
# move each converted .jpg back over the original PNG file name
find . -type f -iname '*.png' -exec sh -c 'for f do mv -- "${f%.*}.jpg" "$f"; done' sh {} +
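
One pitfall of this approach: if a directory already contains, say, logo.png and logo.jpg, converting logo.png writes a new logo.jpg over the existing one, and the rename step then moves it away. The sketch below lists such pairs so you can handle them by hand before converting; png_jpg_collisions is a hypothetical helper name.

```shell
# Sketch: png_jpg_collisions is an illustrative helper name.
# Prints each PNG below a directory (default: current one) that has a
# sibling file with the same base name ending in .jpg.
png_jpg_collisions() {
  find "${1:-.}" -type f -iname '*.png' -exec sh -c '
    for f do
      if [ -e "${f%.*}.jpg" ]; then
        printf "%s\n" "$f"
      fi
    done
  ' sh {} +
}
```

An empty result suggests the PNG conversion will not clobber any existing JPEG files.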
