Rafi

Reputation: 1922

iOS NSXMLParser - Consistently Derive Image Source URL From XML Tag

I'm working with RSS feeds in my app, specifically the Drudge Report's. I'm quite new to this sort of thing, and new to NSXMLParser as well. Each article in the feed is represented by a pair of <item></item> tags.

Within these tags, there's a description enclosed by <description></description> tags. Some descriptions contain an image associated with the article, as seen in the following screenshot:

(screenshot of the feed's XML)

The part I highlighted is the image I need to get (specifically, the URL string). I'm able to get the description of each article as an NSMutableString, but how do I get the image's URL when I parse the XML with NSXMLParser? Here is my code so far:

@interface ViewController () <NSXMLParserDelegate, UITableViewDataSource, UITableViewDelegate> {
    NSXMLParser *parser;
    NSMutableArray *feeds;
    NSMutableDictionary *item;
    NSMutableString *title;
    NSMutableString *link;
    NSMutableString *description;
    NSString *element;
}
.
.(other code)
.
#pragma mark - NSXMLParserDelegate

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict
{
    element = elementName;
    if ([element isEqualToString:@"item"]) {
        item        = [[NSMutableDictionary alloc] init];
        title       = [[NSMutableString alloc] init];
        link        = [[NSMutableString alloc] init];
        description = [[NSMutableString alloc] init];
    }
}

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {

    if ([element isEqualToString:@"title"]) {
        [title appendString:string];
    }
    else if ([element isEqualToString:@"feedburner:origLink"]) {
        [link appendString:string];
    }
    else if ([element isEqualToString:@"description"]) {
        [description appendString:string];
    }
}

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {

    if ([elementName isEqualToString:@"item"]) {
        NSString *filteredTitle = [title stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        NSString *filteredLink = [link stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

        if (![filteredLink containsString:@"https://itunes.apple.com/"]) {
            [item setObject:filteredTitle forKey:@"title"];
            [item setObject:filteredLink forKey:@"link"];
            [item setObject:description forKey:@"description"];

            [feeds addObject:[item copy]];
        }
    }
}

- (void)parserDidEndDocument:(NSXMLParser *)parser {
    [self.tableView reloadData];
}

PROGRESS

So far, I added the following in my didEndElement method:

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName {

    if ([elementName isEqualToString:@"item"]) {
        NSString *filteredTitle = [title stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        NSString *filteredLink = [link stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

        if (![filteredLink containsString:@"https://itunes.apple.com/"]) {
            [item setObject:filteredTitle forKey:@"title"];
            [item setObject:filteredLink forKey:@"link"];
            [item setObject:description forKey:@"description"];
            if ([description rangeOfString:@"img style"].location != NSNotFound)
            {

            }

            [feeds addObject:[item copy]];
        }
    }
}

Now that I know that the description has the img style string in it, I need to get the src="whateverImageURL". How do I use a regular expression to get the first occurrence of this image URL?
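For reference, one way the regular-expression approach could look is sketched below. It is only a sketch: FirstImageSource is a hypothetical helper name, and the pattern assumes the src value is wrapped in double quotes, as in the description text shown above.

```objc
#import <Foundation/Foundation.h>

// Returns the value of the first src="..." attribute in html, or nil if there is none.
// FirstImageSource is a hypothetical helper name, not part of the code above.
static NSString *FirstImageSource(NSString *html) {
    if (html.length == 0) {
        return nil;
    }
    NSError *error = nil;
    // Capture group 1 is everything between the double quotes that follow src=.
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:@"src\\s*=\\s*\"([^\"]*)\""
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        return nil;
    }
    // firstMatchInString: stops at the first occurrence, which is what the question asks for.
    NSTextCheckingResult *match =
        [regex firstMatchInString:html options:0 range:NSMakeRange(0, html.length)];
    if (match == nil) {
        return nil;
    }
    return [html substringWithRange:[match rangeAtIndex:1]];
}
```

Inside the empty "img style" branch above, this could be called as FirstImageSource(description), and the result stored in item under whatever key suits the app.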

Upvotes: 0

Views: 289

Answers (3)

BEN MESSAOUD Mahmoud

Reputation: 736

You have to implement this delegate method:

- (void)parser:(NSXMLParser *)parser foundAttributeDeclarationWithName:(NSString *)attributeName forElement:(NSString *)elementName type:(nullable NSString *)type defaultValue:(nullable NSString *)defaultValue;

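This reports the attributes declared for each element. Note that it is driven by attribute declarations in a DTD; the attributes on an element as it is parsed are passed in the attributeDict argument of parser:didStartElement:namespaceURI:qualifiedName:attributes:.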

Let me know if this helps you :)
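As a rough sketch of reading element attributes during parsing: the delegate below collects the url attribute of RSS 2.0 <enclosure> elements from attributeDict. EnclosureCollector and imageURLs are hypothetical names, and it is an assumption that a feed exposes its image this way. In the feed from the question the img tag sits inside the escaped description text, so it is never reported as an element and this would not find it.

```objc
#import <Foundation/Foundation.h>

// EnclosureCollector is a hypothetical delegate used only for this sketch.
@interface EnclosureCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray<NSString *> *imageURLs;
@end

@implementation EnclosureCollector

- (instancetype)init {
    self = [super init];
    if (self) {
        _imageURLs = [NSMutableArray array];
    }
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary<NSString *, NSString *> *)attributeDict {
    // The attributes of the element being opened arrive in attributeDict.
    if ([elementName isEqualToString:@"enclosure"]) {
        NSString *url = attributeDict[@"url"];
        if (url.length > 0) {
            [self.imageURLs addObject:url];
        }
    }
}

@end
```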

UPDATE

Here is code that finds the URL of the first img in a given string:

NSString *descriptionString = @"&lt;br&gt;&lt;tt&gt;&lt;font size=\"3\" color=\"blue\"&gt;&lt;b&gt;&lt;u&gt;LIST: 10 Worst Winter Storms in Washington History...&lt;/u&gt;&lt;/b&gt;&lt;/font&gt;&lt;/tt&gt;&lt;br&gt;&lt;br&gt;&lt;br&gt;&lt;font face=\"Arial\" size=\"1\"&gt;&lt;i&gt;(Top headline, 3rd story, &lt;a href=\"http://www.nbcwashington.com/news/local/Ten-Worst-Storms-in-DC-History-365815301.html\"&gt;link&lt;/a&gt;)&lt;/i&gt;&lt;/font&gt;&lt;hr style=\"height: 1px; border-style: none; color: #666666; background-color: #666666;\"/&gt;&lt;font face=\"Arial\" size=\"2\"&gt;Related stories:&lt;div class=\"related-links\" id=\"R:H1:S3\"&gt;&lt;a href=\"http://www.wunderground.com/US/DC/001.html#WIN\"&gt;BLIZZARD WARNING ISSUED FOR DC; BURBS UP TO 30\"...&lt;/a&gt;&lt;br&gt;&lt;a href=\"http://washington.cbslocal.com/2016/01/19/winter-is-finally-here-deep-freeze-and-snow-in-the-forecast/\"&gt;Mayor Requests Help From National Guard...&lt;/a&gt;&lt;br&gt;&lt;a href=\"http://www.accuweather.com/en/weather-news/snow-storm-travel-disruptions-aim-for-nyc-dc-boston-philadelphia-friday-saturday/54870622\"&gt;UPDATE...&lt;/a&gt;&lt;br&gt;&lt;a href=\"http://www.infowars.com/snowmaggedon2016-empty-store-shelves-as-panicked-shoppers-ransack-grocery-stores/\"&gt;Anxious Shoppers Ransack Grocery Stores...&lt;/a&gt;&lt;br&gt;&lt;a href=\"http://motherboard.vice.com/read/dark-web-users-are-worried-snowstorm-jonas-will-disrupt-their-deliveries\"&gt;Dark Web Users Fear Delivery Disruptions...&lt;/a&gt;&lt;br&gt;&lt;a href=\"https://www.washingtonpost.com/news/to-your-health/wp/2016/01/21/heres-why-some-people-drop-dead-while-shoveling-snow/\"&gt;Cold weather, shoveling form heart attack 'perfect storm'...&lt;/a&gt;&lt;br&gt;&lt;/div&gt;&lt;/font&gt;&lt;br&gt;&lt;div class=\"feedflare\"&gt;    &lt;a href=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?a=Mtf4NlmV8XU:vDGXzaysxPw:yIl2AUoC8zA\"&gt;&lt;img src=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?d=yIl2AUoC8zA\" border=\"0\"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?a=Mtf4NlmV8XU:vDGXzaysxPw:V_sGLiPBpWU\"&gt;&lt;img src=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?i=Mtf4NlmV8XU:vDGXzaysxPw:V_sGLiPBpWU\" border=\"0\"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?a=Mtf4NlmV8XU:vDGXzaysxPw:qj6IDK7rITs\"&gt;&lt;img src=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?d=qj6IDK7rITs\" border=\"0\"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?a=Mtf4NlmV8XU:vDGXzaysxPw:gIN9vFwOqvQ\"&gt;&lt;img src=\"http://feeds.feedburner.com/~ff/DrudgeReportFeed?i=Mtf4NlmV8XU:vDGXzaysxPw:gIN9vFwOqvQ\" border=\"0\"&gt;&lt;/img&gt;&lt;/a&gt; &lt;/div&gt;&lt;img src=\"http://feeds.feedburner.com/~r/DrudgeReportFeed/~4/Mtf4NlmV8XU\" height=\"1\" width=\"1\" alt=\"\"/&gt;";
// Remove spaces so that variants such as src = "..." are matched as well.
NSString *stringWithoutWhiteSpace = [descriptionString stringByReplacingOccurrencesOfString:@" " withString:@""];
NSUInteger srcLocation = [stringWithoutWhiteSpace rangeOfString:@"src="].location;
if (srcLocation != NSNotFound) {
    NSString *firstSrcImg = [stringWithoutWhiteSpace substringFromIndex:srcLocation];
    // Splitting on the double quote puts the URL in the second component.
    NSArray *componment = [firstSrcImg componentsSeparatedByString:@"\""];
    if ([componment count] > 1) {
        NSString *url = componment[1];
        NSLog(@"%@", url);
    }
}

I invite you to try it and tell me if it answers your question. I can also give you code that returns all img URLs :)

SECOND UPDATE: Here is the example above wrapped in a method that you can use:

- (NSString *)getNextURLFromString:(NSString *)str withURLTag:(NSString *)urlTag {
    NSString *stringWithoutWhiteSpace = [str stringByReplacingOccurrencesOfString:@" " withString:@""];
    NSUInteger srcLocation = [stringWithoutWhiteSpace rangeOfString:urlTag].location;
    if (srcLocation != NSNotFound) {
        NSString *firstSrcImg = [stringWithoutWhiteSpace substringFromIndex:srcLocation];
        // Splitting on the double quote puts the URL in the second component.
        NSArray *componment = [firstSrcImg componentsSeparatedByString:@"\""];
        if ([componment count] > 1) {
            return componment[1];
        }
    }
    return nil;
}

For the urlTag param pass @"src=", and for the str param pass the description tag's value.

UPDATE N° 3

Here is a method that returns all image URLs:

- (NSArray *)getAllURLFromString:(NSString *)str withURLTag:(NSString *)urlTag {
    NSMutableArray *result = [NSMutableArray array];
    NSString *stringWithoutWhiteSpace = [str stringByReplacingOccurrencesOfString:@" " withString:@""];
    NSUInteger srcLocation = [stringWithoutWhiteSpace rangeOfString:urlTag].location;
    if (srcLocation != NSNotFound) {
        NSString *firstSrcImg = [stringWithoutWhiteSpace substringFromIndex:srcLocation];
        NSArray *componment = [firstSrcImg componentsSeparatedByString:@"\""];
        if ([componment count] > 1) {
            NSString *url = componment[1];
            [result addObject:url];

            // Recurse on the text that follows this URL to collect the remaining ones.
            NSArray *nextComponent = [stringWithoutWhiteSpace componentsSeparatedByString:url];
            if ([nextComponent count] > 1) {
                [result addObjectsFromArray:[self getAllURLFromString:nextComponent[1] withURLTag:urlTag]];
            }
        }
    }
    return result;
}

For the urlTag param pass @"src=", and for the str param pass the description tag's value.
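The same idea can also be sketched without recursion by walking the string once with NSScanner. AllImageSources is a hypothetical helper name, and the sketch assumes every src value is wrapped in double quotes with no space around the equals sign.

```objc
#import <Foundation/Foundation.h>

// Collects the value of every src="..." in html, in order of appearance.
// AllImageSources is a hypothetical helper name used only for this sketch.
static NSArray<NSString *> *AllImageSources(NSString *html) {
    NSMutableArray<NSString *> *urls = [NSMutableArray array];
    if (html.length == 0) {
        return urls;
    }
    NSScanner *scanner = [NSScanner scannerWithString:html];
    scanner.charactersToBeSkipped = nil; // do not silently skip whitespace
    while (!scanner.isAtEnd) {
        // Jump to the next src=" marker; stop when none is left.
        [scanner scanUpToString:@"src=\"" intoString:NULL];
        if (![scanner scanString:@"src=\"" intoString:NULL]) {
            break;
        }
        NSString *url = nil;
        // Read up to the closing quote; an empty src="" contributes nothing.
        if ([scanner scanUpToString:@"\"" intoString:&url] && url.length > 0) {
            [urls addObject:url];
        }
    }
    return urls;
}
```

Because the scanner only moves forward, a URL that appears more than once is reported each time, and the input is left untouched rather than having its spaces removed first.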

Upvotes: 0

Rafi

Reputation: 1922

After some research, I've managed to solve my problem. I just needed a little practice with NSRange. The idea, in my case, is that when a description contains the substring "img style", I know I need the first src="whateverImageURL" occurrence in it. I do this in the following code:

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"item"]) {
        NSString *filteredTitle = [title stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
        NSString *filteredLink = [link stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

        if (![filteredLink containsString:@"https://itunes.apple.com/"]) {
            [item setObject:filteredTitle forKey:@"title"];
            [item setObject:filteredLink forKey:@"link"];
            [item setObject:description forKey:@"description"];
            if ([description rangeOfString:@"img style"].location != NSNotFound) {
                NSString *finalImageURL = nil;
                NSRange startRange = [description rangeOfString:@"src=\""];
                if (startRange.location != NSNotFound) {
                    // Drop everything up to and including src=" ...
                    finalImageURL = [description substringFromIndex:NSMaxRange(startRange)];
                    // ... then cut at the closing quote.
                    NSRange endRange = [finalImageURL rangeOfString:@"\""];
                    if (endRange.location != NSNotFound) {
                        finalImageURL = [finalImageURL substringToIndex:endRange.location];
                    }
                }
                // finalImageURL now holds the image URL; store it in item as needed.
            }

            [feeds addObject:[item copy]];
        }
    }
}

Upvotes: 0

Shravan

Reputation: 438

You'll have to do the following in your foundCharacters: method:

    else if ([element isEqualToString:@"description"]) {
        [description appendString:string];
        if ([description rangeOfString:@"img"].location != NSNotFound) {
            // Locate the src attribute and the width attribute that follows it.
            NSRange firstRange = [description rangeOfString:@"src="];
            if (firstRange.location != NSNotFound) {
                NSRange endRange = [[description substringFromIndex:firstRange.location] rangeOfString:@" width=\""];
                if (endRange.location != NSNotFound) {
                    // endRange is relative to firstRange.location, so its location is the length of src="...".
                    NSString *finalLink = [description substringWithRange:NSMakeRange(firstRange.location, endRange.location)];
                    // Skip the leading src=" and read up to the closing quote.
                    NSScanner *scanner = [NSScanner scannerWithString:finalLink];
                    NSString *finalURL = nil;
                    if ([scanner scanString:@"src=\"" intoString:nil] &&
                        [scanner scanUpToString:@"\"" intoString:&finalURL]) {
                        previewImage = finalURL; // previewImage: an NSString ivar, assumed declared elsewhere
                    }
                }
            }
        }
    }
}
  • Since your foundCharacters: method already collects the description tag's text, search the description string that you append to.
  • Do that by scanning the whole string and storing the required substring, i.e. your URL, in a variable.
  • Use the firstRange variable to mark where the substring starts.
  • Use the endRange variable to mark where the substring ends (in your case, the end of the URL).

Here I'm storing the URL in previewImage.

Hope it works for you. Good luck!

Upvotes: 1
