Reputation: 11
The ROME API does not parse the image URL when the URL is wrapped in a CDATA section. For example, the feed at http://www.espn.com/espn/rss/espnu/news contains:
<image>
<![CDATA[
URL of the image
]]>
</image>
In the SyndFeed returned by SyndFeedInput, I have checked foreignMarkups, enclosures, and the DC modules, but the image URL does not appear in any of them.
The values of other elements, such as description and title, are also wrapped in CDATA, and ROME parses those correctly.
Code snippet:
XmlReader xmlReader = null;
try {
    xmlReader = new XmlReader(new URL("http://www.espn.com/espn/rss/espnu/news"));
    SyndFeedInput input = new SyndFeedInput();
    SyndFeed feed = input.build(xmlReader);
} catch (Exception e) {
    e.printStackTrace();
}
Upvotes: 0
Views: 762
Reputation: 11
I looked into the API in more detail. ROME provides a plugin mechanism that lets you override the parsing: https://rometools.github.io/rome/RssAndAtOMUtilitiEsROMEV0.5AndAboveTutorialsAndArticles/RssAndAtOMUtilitiEsROMEPluginsMechanism.html
I wrote a class that extends RSS20Parser, implements WireFeedParser, and overrides the parseItem method:
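@Override
public Item parseItem(Element rssRoot, Element eItem, Locale locale) {
    Item item = super.parseItem(rssRoot, eItem, locale);
    Element imageElement = eItem.getChild("image", getRSSNamespace());
    if (imageElement != null) {
        // The URL may be the element's own text (CDATA) or sit in a nested <url> child.
        // getTextTrim() drops the line breaks that surround the URL inside the CDATA block.
        String imageUrl = imageElement.getTextTrim();
        Element urlElement = imageElement.getChild("url");
        if (urlElement != null) {
            imageUrl = urlElement.getTextTrim();
        }
        // Expose the URL as an enclosure so that it is carried over into the SyndFeed.
        Enclosure e = new Enclosure();
        e.setType("image");
        e.setUrl(imageUrl);
        item.getEnclosures().add(e);
    }
    return item;
}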
Now, in the SyndFeed, read each entry's enclosures list and you will find the image URL there:
List<SyndEntry> entries = feed.getEntries();
for (SyndEntry entry : entries) {
    ...
    ...
    List<SyndEnclosure> enclosures = entry.getEnclosures();
    if (enclosures != null) {
        for (SyndEnclosure enclosure : enclosures) {
            if ("image".equals(enclosure.getType())) {
                System.out.println("image URL : " + enclosure.getUrl());
            }
        }
    }
}
Finally, create a rome.properties file on the classpath with the following entry:
WireFeedParser.classes=your.package.name.CustomRomeRssParser
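If you want to sanity-check the fallback logic from parseItem (use the nested url child if it exists, otherwise the image element's own CDATA text) without ROME on the classpath, here is a rough sketch using only the JDK's DOM parser. The class name ImageUrlDemo, the helper names, and the example.com URLs are made up for illustration; this is not how ROME does it internally, just the same idea in isolation.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of ROME.
class ImageUrlDemo {

    // Returns the trimmed image URL of an <item>, or null if it has no <image> child.
    static String extractImageUrl(String itemXml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(itemXml)));
            Element image = firstChildElement(doc.getDocumentElement(), "image");
            if (image == null) {
                return null;
            }
            // Prefer a nested <url>; otherwise fall back to the element's own text.
            Element url = firstChildElement(image, "url");
            String text = (url != null ? url : image).getTextContent();
            // CDATA content usually carries surrounding line breaks, so trim it.
            return text.trim();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse item XML", e);
        }
    }

    // Finds the first direct child element with the given tag name, or null.
    static Element firstChildElement(Element parent, String name) {
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE && name.equals(child.getNodeName())) {
                return (Element) child;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String cdataItem = "<item><image><![CDATA[\n http://example.com/a.png \n]]></image></item>";
        String nestedItem = "<item><image><url>http://example.com/b.png</url></image></item>";
        System.out.println(extractImageUrl(cdataItem));
        System.out.println(extractImageUrl(nestedItem));
    }
}
```

Both layouts come back as a clean URL, which is why trimming the text before calling setUrl on the enclosure is worth doing in the ROME parser as well.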
Upvotes: 0