tobias.henn

Reputation: 225

Split NSString into NSArray by blank lines

I am reading a *.srt subtitle file into an NSString. The content of this string looks like this:

1
00:00:20,000 --> 00:00:24,400
Altocumulus clouds occur between six thousand

2
00:00:24,600 --> 00:00:27,800
and twenty thousand feet above ground level.

I am looking for an elegant way to split this string into an NSArray in which each element contains everything belonging to one subtitle "frame". For example, the zeroth element would look like this:

1
00:00:20,000 --> 00:00:24,400
Altocumulus clouds occur between six thousand

Any ideas how to accomplish this task in an elegant manner? I tried splitting the original string using the method

[string componentsSeparatedByString:@"\n\n"];

but this fails to detect the blank lines.

Thanks for your help!

tobi

Upvotes: 1

Views: 3568

Answers (3)

rob mayoff

Reputation: 385700

If [string componentsSeparatedByString:@"\n\n"] doesn't work, there are two likely explanations:

  1. Your file contains MS-DOS-style (Windows) line breaks, which are \r\n. In that case, try splitting on @"\r\n\r\n".

  2. Your supposedly blank lines contain spaces or tabs. You can check this from the shell using cat -e, which marks the end of each line with $ (a CRLF line ends in ^M$).
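
Both cases can be handled in one pass by normalizing the line breaks first and then collapsing whitespace-only lines. The following is only a minimal sketch, not part of the original answer: SplitSubtitleBlocks is a hypothetical helper name, and it assumes the whole file fits comfortably in memory.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns one string per subtitle block, where blocks
// are separated by one or more blank (or whitespace-only) lines.
static NSArray<NSString *> *SplitSubtitleBlocks(NSString *text) {
    // Normalize Windows (\r\n) and bare \r line breaks to \n.
    NSString *normalized = [text stringByReplacingOccurrencesOfString:@"\r\n"
                                                           withString:@"\n"];
    normalized = [normalized stringByReplacingOccurrencesOfString:@"\r"
                                                       withString:@"\n"];

    // Drop leading and trailing blank lines so no empty blocks appear at the ends.
    normalized = [normalized stringByTrimmingCharactersInSet:
                     [NSCharacterSet whitespaceAndNewlineCharacterSet]];
    if (normalized.length == 0) {
        return @[];
    }

    // Collapse every run of blank or whitespace-only lines into exactly "\n\n".
    normalized = [normalized stringByReplacingOccurrencesOfString:@"\\n([ \\t]*\\n)+"
                                                       withString:@"\n\n"
                                                          options:NSRegularExpressionSearch
                                                            range:NSMakeRange(0, normalized.length)];

    return [normalized componentsSeparatedByString:@"\n\n"];
}
```

Note that the returned blocks use \n internally, regardless of what the file used.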

Upvotes: 6

omz

Reputation: 53551

I'd suggest using NSScanner instead. It's more flexible and you don't have to worry about whether your line breaks are Windows or Unix style and whether the blank lines contain any spaces. Here's an example:

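NSMutableArray *lines = [NSMutableArray array];
NSString *s = @"foo\n\nbar\r\n  \t  \r\nbaz"; //intentionally mixed line breaks
NSScanner *scanner = [NSScanner scannerWithString:s];
while (![scanner isAtEnd]) {
    // Consume any run of line breaks before the next piece of text.
    [scanner scanCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:NULL];
    NSString *line = nil;
    // Read everything up to the next line break. By default the scanner also
    // skips leading whitespace, so whitespace-only lines produce nothing.
    [scanner scanUpToCharactersFromSet:[NSCharacterSet newlineCharacterSet] intoString:&line];
    if (line) {
        [lines addObject:line];
    }
}
NSLog(@"%@", lines); // lines now holds foo, bar, baz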
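
The loop above yields individual non-empty lines rather than whole subtitle frames. One way to get frames while staying independent of the line-break style is to walk the lines and start a new group at each blank one. This is a rough sketch under my own assumptions, not omz's code: GroupSubtitleLines is a hypothetical name, and it swaps NSScanner for -enumerateLinesUsingBlock:, which treats \n, \r\n and \r as line breaks.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns an array of frames, each frame being an
// array of its lines (index, time range, text lines).
static NSArray<NSArray<NSString *> *> *GroupSubtitleLines(NSString *text) {
    NSMutableArray<NSArray<NSString *> *> *frames = [NSMutableArray array];
    __block NSMutableArray<NSString *> *current = [NSMutableArray array];
    NSCharacterSet *blanks = [NSCharacterSet whitespaceCharacterSet];

    [text enumerateLinesUsingBlock:^(NSString *line, BOOL *stop) {
        BOOL isBlank = [line stringByTrimmingCharactersInSet:blanks].length == 0;
        if (!isBlank) {
            [current addObject:line];
        } else if (current.count > 0) {
            // A blank (or whitespace-only) line closes the current frame.
            [frames addObject:current];
            current = [NSMutableArray array];
        }
    }];

    // The last frame usually has no blank line after it.
    if (current.count > 0) {
        [frames addObject:current];
    }
    return frames;
}
```

If a single string per frame is preferred, each inner array can be joined with [frame componentsJoinedByString:@"\n"].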

Upvotes: 4

UIAdam

Reputation: 5313

According to http://en.wikipedia.org/wiki/SubRip, SubRip files use CRLF line breaks, i.e. \r\n, so the blank line between two frames shows up as \r\n\r\n rather than \n\n.

Upvotes: 0
