aggsol

Reputation: 2510

How to extract podcast URLs from XML feed with xsltproc?

I want to extract URLs from podcast feeds with xsltproc (or any other tool I can use in Bash). There are the following two types of XML feeds.

Type A

<rss xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
    <title>Podcast</title>
    <item>
        <title>Episode</title>
        <media:content url="http://example.org/example.mp3" fileSize="1234" type="audio/mpeg"/>
    </item>
    </channel>
</rss>

Type B

<rss>
    <channel>
    <title>Podcast</title>
    <item>
        <title>Episode</title>
        <guid>episode::x</guid>
        <enclosure type="image/jpeg" url="http://example.org/coverart.jpg"/>
        <enclosure type="audio/mpeg" url="http://example.net/audio.mp3"/>
    </item>
    </channel>
</rss>

I have the following stylesheet that returns the URLs from type B but not from type A. Can I even mix those two in one stylesheet?

<?xml version="1.0"?>
<stylesheet version="1.0" xmlns="http://www.w3.org/1999/XSL/Transform">
    <output method="text"/>
    <template match="/">
        <for-each select = "rss/channel/item/enclosure">
            <value-of select="@url"/><text>&#10;</text>
        </for-each>
        <for-each select = "rss/channel/item/media">
            <value-of select="@url"/><text>&#10;</text>
        </for-each>
    </template>
</stylesheet>

Upvotes: 0

Views: 374

Answers (1)

Aniket V

Reputation: 3247

In the Type A feed, the <content> element is in the namespace http://search.yahoo.com/mrss/, which the feed binds to the prefix media. Your stylesheet does not declare that namespace, so its XPath expressions cannot match elements in it. Declare the namespace in the stylesheet so that those elements can be selected:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
    xmlns:media="http://search.yahoo.com/mrss/"
    exclude-result-prefixes="media">

Inside the template, the for-each must select media:content rather than media. In your expression only the prefix is present; the element's local name, content, is missing.

<xsl:for-each select="//media:content">
    <xsl:value-of select="@url" />
    <xsl:text>&#10;</xsl:text>
</xsl:for-each>
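To answer the question of whether both feed types can be mixed in one stylesheet: they can. The following is a minimal sketch of how the pieces above might fit together, not a tested drop-in. It assumes the feeds look exactly like the Type A and Type B samples in the question, and it uses a single XPath union to select both kinds of element.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:media="http://search.yahoo.com/mrss/">
    <xsl:output method="text"/>
    <xsl:template match="/">
        <!-- enclosure has no namespace (Type B);
             media:content is in the Media RSS namespace (Type A).
             The union selects whichever of the two a feed contains. -->
        <xsl:for-each select="rss/channel/item/enclosure
                            | rss/channel/item/media:content">
            <xsl:value-of select="@url"/>
            <xsl:text>&#10;</xsl:text>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

Saved under a hypothetical name such as podcast.xsl, it could be run as `xsltproc podcast.xsl feed.xml`, where feed.xml is likewise a placeholder. Like the stylesheet in the question, this prints every enclosure URL, including the cover art in Type B. If only audio is wanted, a predicate such as `enclosure[starts-with(@type, 'audio/')]` is one way to narrow the selection.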

Upvotes: 2
