Friday, November 4, 2011

Recursive download from ftp directory with wget without copying the entire tree structure

A simple and fast way to download a directory recursively with wget. Suppose that you want to download the entire directory "igblast" from http://mirrors.vbi.vt.edu/mirrors/ftp.ncbi.nih.gov/blast/temp/igblast/.
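If you run a plain recursive download (wget -r --no-parent) on that URL, you will get the entire directory tree structure beginning from 'mirrors.vbi.vt.edu'. The parent directories are created as structure only, with no other files in them, because --no-parent stops wget from fetching anything above "igblast".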
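A minimal sketch of that plain recursive download, assuming GNU wget and a POSIX shell. The variable names url and default_dir are illustrative. The command is printed rather than executed, and the second part shows where wget would place the files by default (host name followed by the remote path).

```shell
url="http://mirrors.vbi.vt.edu/mirrors/ftp.ncbi.nih.gov/blast/temp/igblast/"

# Print the command instead of running it; drop "echo" to start the download.
echo wget -r --no-parent "$url"

# By default wget saves files under <host>/<remote path>,
# so strip the scheme to see the local directory it would create.
default_dir="${url#*://}"
echo "$default_dir"
```

The last line prints mirrors.vbi.vt.edu/mirrors/ftp.ncbi.nih.gov/blast/temp/igblast/, which is the nested tree that gets created around the files you actually want.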

To get only the "igblast" files and directories, add the options -nH and --cut-dirs to the command.

Here -nH removes the host directory "mirrors.vbi.vt.edu/" and --cut-dirs=N removes the first N directory components of the remote path.
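As a sketch of how the --cut-dirs value can be chosen, assuming GNU wget and a POSIX shell: the remote path here has five directory components (mirrors, ftp.ncbi.nih.gov, blast, temp, igblast), so cutting four of them leaves a local "igblast" directory. The variable names (url, path, depth, cut) are illustrative, and the command is printed rather than executed.

```shell
url="http://mirrors.vbi.vt.edu/mirrors/ftp.ncbi.nih.gov/blast/temp/igblast/"

# Remote path without the scheme, the host, and the trailing slash.
path="${url#*://*/}"
path="${path%/}"    # mirrors/ftp.ncbi.nih.gov/blast/temp/igblast

# Count the directory components: one per line after splitting on "/".
depth=$(printf '%s\n' "$path" | tr '/' '\n' | wc -l)

# Cut every component except the last one, so "igblast" itself is kept.
cut=$((depth - 1))

# Print the command instead of running it; drop "echo" to start the download.
echo wget -r -nH --cut-dirs="$cut" --no-parent "$url"
```

Using --cut-dirs with the full depth (5 here) instead would drop the "igblast" directory as well and place its contents directly in the current directory.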

4 comments:

  1. Hello Thiago,

    Today I arrived at your blog while I was looking for a standalone version of igblast. I used your recursive download to obtain igblast; however, while trying to run it I found out that the main file, igblastn, was not executable. Have you used igblastn before? Do you have any suggestions?


    I appreciate any kind of help and thank you in advance,

    Carlos

  2. Hi Carlos,

    I just downloaded igblast from FTP and used igblastn on my Linux machine (Archlinux, Kernel 3.1.1). Are you using Linux? Did you try changing the file permissions to make it executable?

  3. Hi Thiago,

    Thanks for your quick reply. Now I feel my ears growing like those of a donkey. It was just what you mentioned: I forgot to change the file permissions.

    Thanks a lot!

  4. Thank you!!!! I've been searching for this command set for a long time!

    Most times people allude to the option to download a directory, have it keep its structure, *and* leave off the parent directory tree structure but never spell it out like you did.

    I have <1gb files that need transferring for work and wget is a lifesaver so I don't tie up my machine.
