How to copy a website with httrack on Linux

This is more for my own reference than anything. Say you see a flog on the intertubes and want to rip it and stick it up for affiliate links. How do you do it quickly on Linux? I used to use wget, but it sucked. httrack is much better.

httrack "http://www.techcrunch.com/" -N1 -O "/home/techcrunch_rip/public_html" +techcrunch.com/* +crunchgear.com/* -v

This will rip techcrunch, starting at the homepage, and stick it in the folder specified by -O. The URL filters after that (the + arguments) make sure it only downloads files from those domains. The -N1 argument is the most important: it makes httrack stick all the images, CSS and other non-HTML files in one directory instead of creating loads of directories. Very handy. The -v at the end just prints progress to the screen.
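If you rip sites often, the command above can be wrapped in a small shell function so you only type the URL, the output folder and the domains. This is just a rough sketch: the name mirror_site and the DRY_RUN variable are my own inventions, not part of httrack. The only httrack options used are the ones from the command above (-N1, -O, the + filters and -v).

```shell
#!/bin/sh
# mirror_site START_URL OUTPUT_DIR [DOMAIN...]
# Hypothetical wrapper around the httrack command shown above.
# Set DRY_RUN=1 to print the command instead of running it.
mirror_site() {
    url=$1
    out=$2
    shift 2

    # Turn each remaining argument (a bare domain) into a "+domain/*" filter.
    # The loop list is expanded once up front, so it is safe to rewrite
    # the positional parameters inside the loop.
    for domain in "$@"; do
        set -- "$@" "+$domain/*"
        shift
    done

    if [ -n "${DRY_RUN:-}" ]; then
        # Only show what would be run.
        echo httrack "$url" -N1 -O "$out" "$@" -v
    else
        # Filters are passed quoted so the shell never expands the "*".
        httrack "$url" -N1 -O "$out" "$@" -v
    fi
}
```

Calling mirror_site "http://www.techcrunch.com/" "/home/techcrunch_rip/public_html" techcrunch.com crunchgear.com runs the same rip as above. Put DRY_RUN=1 in front first if you want to eyeball the command before it starts downloading.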

Published by Georgie Casey

student.
