Wasim Aftab

Reputation: 808

wget a specific file using regex in directory and file names

I am trying to download ftp://ftp.flybase.net/releases/current/dmel_r6.21/fasta/dmel-all-translation-r6.21.fasta.gz via wget, using wildcards in both the directory name and the file name. I am running this command:

wget ftp://ftp.flybase.net/releases/current/dmel_r*/fasta/dmel-all-translation-r*.fasta.gz

What am I doing wrong? Thanks.

Upvotes: 1

Views: 953

Answers (1)

Wasim Aftab

Reputation: 808

I solved it using the following shell script:

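#!/bin/bash

# Create the working directory if needed, then enter it.
if [ -d Test_Wget ]; then
    echo "directory exists, changing directory for child process"
else
    mkdir Test_Wget && echo "directory created, changing directory for child process"
fi
cd Test_Wget || exit 1

# Fetch the FTP directory listing once; the rest of the script expects it in index.html.
if [ ! -f index.html ]; then
    wget --no-parent -A 'dmel_r*/fasta/dmel-all-translation-r*.fasta.gz' ftp://ftp.flybase.net/releases/current/
fi

# Keep only the listing lines that link to a dmel release directory.
awk '/href.*dmel/' index.html > url_with_crap

# Extract the double-quoted URL from those lines.
grep -o '".*"' url_with_crap > url_with_quotes

# Strip the surrounding double quotes.
part_url=$(sed -e 's/^"//' -e 's/"$//' < url_with_quotes)

# Build the download URL; the wildcard is now only in the file name.
url="$part_url/fasta/dmel-all-translation-r*.fasta.gz"

# Quote the URL so the shell passes the wildcard through to wget unchanged.
wget "$url"

gunzip dmel-all-translation-r*.fasta.gz

# Remove everything except the decompressed FASTA file.
shopt -s extglob
rm -- !(dmel-all-translation-r*.fasta)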

I am sure this is too naive a solution for the problem. I am still waiting for a more elegant reply.
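
One possible way to tidy up the parsing steps is to fold them into two small functions, so no temporary files are needed. This is only a sketch under assumptions: the function names extract_release_url and build_fasta_url are made up here, and it assumes index.html contains a link of the form href="..." that points at the dmel_r* directory. The download commands at the end are left as comments because they need network access.

```shell
#!/bin/bash

# Hypothetical helper: read an HTML directory listing on stdin and print the
# first href value that contains "dmel_r" (assumed to be the release directory).
extract_release_url() {
    grep -o 'href="[^"]*dmel_r[^"]*"' \
        | head -n 1 \
        | sed -e 's/^href="//' -e 's/"$//'
}

# Hypothetical helper: append the fasta path and file-name wildcard to the
# release directory URL, dropping one trailing slash to avoid "//".
build_fasta_url() {
    local base="${1%/}"
    printf '%s/fasta/dmel-all-translation-r*.fasta.gz\n' "$base"
}

# Intended use (not run here, requires network access):
#   wget ftp://ftp.flybase.net/releases/current/      # expected to leave index.html
#   url=$(build_fasta_url "$(extract_release_url < index.html)")
#   wget "$url"                                       # quoted: wget sees the *
```

The wildcard stays only in the file name part of the URL, as in the script above, and the URL is quoted so the shell does not try to expand it first.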

Upvotes: 1
