httpmirror - Mirror Linux distributions using http

Here is the deal...

You are at work. Your network admin uses restrictive firewall rules. You can't use rsync or ftp to create a local mirror of your favorite Linux distribution. Your only door to Linux mirrors is through http. What to do?

I took this as an opportunity to brush up on my rusty Python scripting skills and cobbled together httpmirror.

httpmirror is a roughly made script that lets you mirror Linux distributions. It provides two main functions:

  1. Recursively download all the files from a Linux http mirror.

  2. Recursively download a list of all the files from a Linux http mirror.

Recursive downloading of files is experimental, so use it only if you have a very reliable connection and feel adventurous. That said, resuming works in this mode, whereas it does not when you are only creating a list of files.

Alternatively, you can generate only a list of files to download and then feed that list to your favorite download program. This is the recommended way.
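
As a rough illustration of the list-only mode, the sketch below pulls the file and directory links out of a single directory index page. It is not taken from httpmirror itself: the names IndexLinkParser and split_links are hypothetical, and it assumes the mirror serves a plain HTML index in which subdirectory links are relative and end in '/'.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IndexLinkParser(HTMLParser):
    """Collect the href of every <a> tag on a directory index page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(base_url, html):
    """Return (directories, files) linked from one index page.

    Assumption: only relative links that point below base_url are wanted.
    """
    parser = IndexLinkParser()
    parser.feed(html)
    dirs, files = [], []
    for href in parser.hrefs:
        # Skip column-sort links ("?C=N;O=D"), absolute paths, fragments,
        # parent-directory links and links to other hosts.
        if href.startswith(("?", "/", "#", "..")) or "://" in href:
            continue
        url = urljoin(base_url, href)
        if href.endswith("/"):
            dirs.append(url)
        else:
            files.append(url)
    return dirs, files
```

A recursive walk would fetch each returned directory URL in turn, call split_links on its page, and append the file URLs to the output list.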

To get a better picture of what you can do with httpmirror, here is its output when you don't provide any command-line arguments:

a@a:~/Desktop/python-to-go$ ./httpmirror.py
Usage: httpmirror.py [options]

Options:
-h, --help  show this help message and exit
-x NOLIST   a comma-separated list of words that should be avoided
-m URL      The base url. This will download all the files
-l URLLIST  The base url. This will only save the list of files
-o FILE     The file output should you decide to only download the file list

  • The -x option allows the use of exceptions. You often need to exclude certain files or directories from being fetched. For example, if you are only interested in the i386 arch distribution and want to exclude amd64, sparc, and powerpc, you'd add the following: '-x amd64,sparc,powerpc'

  • You MUST specify either -m with a URL or -l with a URL.

  • The -o option is optional and only useful if you want to generate a list of files to download and want to specify a custom file name. The default is out.txt.

  • To use a proxy server, the 'http_proxy' environment variable must be set (on Linux you'd enter 'export http_proxy=http://myproxy:port').
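
To make the -x behaviour more concrete, here is a minimal sketch of how a comma-separated exclusion list could be applied to candidate URLs. The helper names parse_exclusions and is_excluded are hypothetical, and plain substring matching is an assumption; the actual script may match differently.

```python
def parse_exclusions(arg):
    """Split a '-x' style value such as 'amd64,sparc' into a list of words."""
    return [word.strip() for word in arg.split(",") if word.strip()]


def is_excluded(url, words):
    """Return True if any excluded word occurs anywhere in the URL.

    Assumption: a simple substring test, so 'sparc' would also match
    'sparc64'.
    """
    return any(word in url for word in words)
```

With the list from '-x amd64,sparc,powerpc', a URL such as http://example.org/dist/amd64/foo.deb (a made-up address) would be skipped, while the same path under i386 would be kept.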

FAQs:

Q. I can make wget do the same thing!

A. Sure you can but you won't be able to filter out files/directories using pre-set criteria.

Download and Run

To run the script on Linux, make it executable with 'chmod +x ./httpmirror.py' and then run './httpmirror.py'. On Windows, simply run 'python httpmirror.py'.

Change Log
0.0.3 - 23-May-2008
* Exceptions are now optional
* Minor bug fix (the os.linesep issue)

Download httpmirror (0.0.3) 5.2 KB

