AwesomeBen1

Reputation: 75

How to download multiple files whose URLs are stored in an xml document?

I have an xml document on my computer that basically looks like this:

<item playlist="3" gameid="32" catid="1" title="Cul-De-Sac of Memories" artist="Christopher Lennertz" scr="../mp3/sims3p/build/cul-de-sac_of_memories.flv" />

<item playlist="3" gameid="30" catid="4" title="Brave" artist="Kelis" scr="../mp3/sims3ln/electronica/brave.flv" />

<item playlist="3" gameid="15" catid="1" title="First Volley" artist="General Midi" scr="../mp3/sims2nl/build/general_midi_-_first_volley.flv" />

Except that it has many more items than that (and some comments). I've been trying hard to find a way to get a program/script to:

  1. Take the URL between scr=" and the very next " in the XML tag.
  2. Replace the ../ in the URL with http://www.WEBSITE.com/ and maybe store it as a variable, like Song_URL.
  3. Take the song name between title=" and the very next " from the same tag it got the URL from, and maybe store that in a variable too, like Song_Name.
  4. Download the song from Song_URL and name it Song_Name.

It should do this for every tag like that in the document. Note that some tags in the document look like this: <item playlist="2" gameid="28" catid="2" title="Load" /> and don't matter to me.

I know a tiny bit of Bash, AppleScript, and Python, but not enough of any of them to do this. If anyone could help me do this, in whatever programming language you like (one of the three I listed, or Java, Ruby, C, or anything else), I would very much appreciate it!

Upvotes: 4

Views: 1663

Answers (2)

AwesomeBen1

Reputation: 75

I figured out how to do this with help from a friend. After using a basic text editor to replace all instances of ../ with http://www.WEBSITE.com/, I used the following program to download the songs:

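# Python 3: urlopen lives in urllib.request (it was urllib.urlopen in Python 2)
import urllib.request

# Read the file line by line; each <item ... /> tag sits on its own line
with open('/PATH/TO/FILE.txt') as F:
    document = F.readlines()

for string in document:
    # Skip comments and tags that have no scr or title attribute
    if 'scr="' not in string or 'title="' not in string:
        continue

    # The URL sits between scr=" and the next double quote
    index1 = string.find('scr="')+5
    index2 = string.find('"',index1)
    Song_url = string[index1:index2]
    
    # The song name sits between title=" and the next double quote
    index3 = string.find('title="')+7
    index4 = string.find('"',index3)
    Song_name = string[index3:index4]
    
    # Download the song and write it out in binary mode
    with urllib.request.urlopen(Song_url) as u:
        with open((Song_name + '.flv'),'wb') as localFile:
            localFile.write(u.read())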

And it worked like a charm.
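
The find-and-replace I did by hand in a text editor could probably be folded into the script as well. Here is a minimal sketch of that step, assuming every relative path starts with ../ as in the question. The helper names to_absolute and safe_filename are made up for illustration, and the set of characters that safe_filename replaces is my own guess at what causes trouble in file names, not something taken from the real playlist.

```python
import re

# Placeholder host from the question; substitute the real site.
BASE_URL = "http://www.WEBSITE.com/"


def to_absolute(scr, base_url=BASE_URL):
    """Swap a leading ../ for the base URL; leave any other value unchanged."""
    if scr.startswith("../"):
        return base_url + scr[len("../"):]
    return scr


def safe_filename(title, extension=".flv"):
    """Replace characters that are awkward in file names with underscores."""
    # Assumption: these are the characters worth replacing (hypothetical list).
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", title).strip()
    return cleaned + extension
```

With these, Song_url = to_absolute(Song_url) and safe_filename(Song_name) could stand in for the manual replacement and the plain Song_name + '.flv'.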

Upvotes: 0

X Zhang

Reputation: 307

I don't know how to solve this problem in Python, but it seems you need an XML parser library to extract the tags you want. Then use some string operations to build the desired URL, and finally download your mp3 from that URL.
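
For what it's worth, the parser-based extraction might look roughly like this with Python's standard xml.etree.ElementTree module. This is only a sketch under assumptions: the function name extract_songs is hypothetical, and it wraps the text in a made-up root element in case the file is a bare list of item tags. If the real file starts with an XML declaration, that wrapping would fail and the file should be parsed directly instead.

```python
import xml.etree.ElementTree as ET


def extract_songs(xml_text):
    """Return (title, scr) pairs for every <item> that has both attributes."""
    # Assumption: the items may not share a root element, so add one.
    root = ET.fromstring("<root>" + xml_text + "</root>")
    songs = []
    for item in root.iter("item"):
        title = item.get("title")
        scr = item.get("scr")
        # Tags such as <item ... title="Load" /> have no scr and are skipped.
        if title and scr:
            songs.append((title, scr))
    return songs
```

Comments in the document are ignored by the parser, so they need no special handling. The scr values come back unchanged, so the ../ prefix still has to be replaced before downloading.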

I am quite sure you can do the job in Python. But if you do not mind handling it in Java, this site describes some XML parser libraries, and I think any of them will meet your needs. Once you have the URL, you can read the song just as you would read a local file, using the following code:

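// Requires: import java.net.URL; import java.io.InputStream;
URL url = new URL("your song url");
// openStream() opens the connection itself and returns the response body as a stream
InputStream reader = url.openStream();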
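
For comparison, a rough Python counterpart to the snippet above, which also copies the stream into a local file. It is a sketch, not a tested downloader: the function name download is made up, and it does no error handling or retries.

```python
import shutil
import urllib.request


def download(url, destination):
    """Stream the response body at url into the file at destination."""
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        # copyfileobj copies in chunks, so the whole song is never held in memory
        shutil.copyfileobj(response, out)
```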

Hope that helps.

Upvotes: 1
