Joe Berg

Reputation: 381

Wildcard for curl/wget to download xml feeds

I want to download several feeds that are named feed.xml, feed2.xml, feed3.xml, etc. and have them appended to the same document.

My script below works: it checks for nine more pages (2 through 10). But I would like to use a wildcard instead of specifying an upper limit.

curl -lo ~/Desktop/feed.xml https://address/feed.xml && curl -s https://address/feed[2-10].xml >> ~/Desktop/feed.xml

The following attempts with wildcards fail for me, and I am not sure what is wrong.

With [2-*] or *:

curl -lo ~/Desktop/feed.xml https://address/feed.xml && curl -s https://address/feed[2-*].xml >> ~/Desktop/feed.xml


curl -lo ~/Desktop/feed.xml https://address/feed.xml && curl -s https://address/feed*.xml >> ~/Desktop/feed.xml

With ?:

curl -lo ~/Desktop/feed.xml https://address/feed.xml && curl -s https://address/feed?.xml >> ~/Desktop/feed.xml

Source: https://curl.haxx.se/libcurl/c/CURLOPT_WILDCARDMATCH.html

Upvotes: 1

Views: 1445

Answers (1)

Ezphares

Reputation: 359

If you look at the page you linked about curl's wildcard matching, you will see that:

This feature is only supported for FTP download.

The reason for this is simple: FTP servers are (usually) listable, so accessing ftp://address/ provides a list of files that can be used to resolve the wildcard in something like ftp://address/feed*.xml.

HTTP(S) does not inherently provide a way to list all resources at an address, so curl has no way to determine how many feeds exist.
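The only way to find out over HTTP whether a given feed exists is to request it and check the status code. As a rough illustration (with https://address standing in for the real host, and feed11.xml as an arbitrary candidate), this prints 200 for a feed that exists and typically 404 for one that does not:

curl -s -o /dev/null -w '%{http_code}\n' https://address/feed11.xml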

If the server does provide a directory of feeds (at some other URL), you could request that first and use it to generate the range. Otherwise, if the number of feeds is relatively static, you might be better off providing the range manually, as you do at the moment.
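As a minimal sketch of the first suggestion, assuming a hypothetical index page at https://address/feeds/ whose HTML links to each feed, you could scrape the file names and fetch them one by one:

# Hypothetical index URL and file-name pattern; adjust both to the real site.
# Start with an empty output file, then append each feed found on the index.
: > ~/Desktop/feed.xml
curl -s https://address/feeds/ \
  | grep -o 'feed[0-9]*\.xml' \
  | awk '!seen[$0]++' \
  | while read -r f; do
      curl -s "https://address/$f" >> ~/Desktop/feed.xml
    done

The awk filter simply drops duplicate names while keeping the order in which they appear on the index page.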

Upvotes: 1
