I would like to parse a string like this:
NSString *str = @"firstcolumn second column text Third Column Text";
I have three columns of text, each column could be text with spaces.
I know how wide the columns are: col1 = 10 chars long, col2 = 20, col3 = 30.
I know I could use NSRange(0,len1),(10,len2),(20,len3).
I get "Out of range" crashes because the length varies: the text in a column doesn't have to reach its maximum length.
Any ideas how to do this?
NSString *str = @"A000 B11 This is text description This column is a longer Text description";
//A000 column can be 10 chars long
//B11 can be 20 chars
//This is some text description can be 30 characters long
NSString *code1 = [line substringWithRange:NSMakeRange(0,10)];
NSString *code2 = [line substringWithRange:NSMakeRange(10,20)];
NSString *shorttext = [line substringWithRange:NSMakeRange(20,20)];
NSString *longtext = [line substringWithRange:NSMakeRange(30,30)];
I would like to get code1 = A000 in the above example. It can be up to 10 chars long, but doesn't have to be, as you can see. The same goes for the other columns, code2 and the text. How can I do this?
If I understand correctly, you have an input NSString str which consists of three concatenated strings: col1, col2, and col3. Additionally, you know the following constraints about the problem:
- col1 is between 0 and 10 characters
- col2 is between 0 and 20 characters
- col3 is between 0 and 30 characters
and you want to recover these strings from str. Put differently, you want to uniquely determine col1, col2, and col3 so that str is equal to
[NSString stringWithFormat:@"%@%@%@", col1, col2, col3];
Unfortunately, as others have commented, this is not possible without modifying the problem. To see why not, consider the case where
str = @"a";
In this case, you know that one of the component strings (col1, col2, or col3) is equal to @"a" and the other two are equal to @"". However, it's not possible to determine which. If, for example, col1 = @"a" and col2 and col3 are both equal to @"", then
[NSString stringWithFormat:@"%@%@%@", col1, col2, col3]
evaluates to
@"a"
as desired. However, this is also true if col1 and col2 are equal to @"" and col3 = @"a", since
[NSString stringWithFormat:@"%@%@%@", col1, col2, col3]
still evaluates to
@"a"
The problem here is not that the component strings are able to be empty but rather that they're able to vary over a range.
If we constrained the problem so that the lengths were exact:
- col1, which is 10 characters long
- col2, which is 20 characters long
- col3, which is 30 characters long
it would then be possible to recover col1, col2, and col3 from str with the following function:
void GetColumnsFromString(NSString *str, NSString * __autoreleasing *col1, NSString * __autoreleasing *col2, NSString * __autoreleasing *col3)
{
    // Assumes str is at least 60 characters long; a shorter string raises NSRangeException.
    if (col1) {
        *col1 = [str substringWithRange:NSMakeRange(0, 10)];
    }
    if (col2) {
        *col2 = [str substringWithRange:NSMakeRange(10, 20)];
    }
    if (col3) {
        *col3 = [str substringWithRange:NSMakeRange(30, 30)];
    }
}
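If the columns are padded with spaces so that each one starts at a fixed offset, but the line itself may end early, the out-of-range exception can be avoided by clamping each range to the actual string length. The following is only a sketch under that assumption; ClampedColumn is a hypothetical helper name, not part of Foundation:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the part of `str` that falls inside
// [location, location + maxLength), clamped to the string's actual length,
// with surrounding spaces and tabs trimmed. It does not raise NSRangeException
// when the line is shorter than the column layout.
static NSString *ClampedColumn(NSString *str, NSUInteger location, NSUInteger maxLength)
{
    if (location >= str.length) {
        return @"";  // the column starts past the end of the line
    }
    NSUInteger available = str.length - location;
    NSRange range = NSMakeRange(location, MIN(maxLength, available));
    NSString *column = [str substringWithRange:range];
    return [column stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
}
```

With offsets 0, 10, and 30 as in the function above, a line that stops partway through the second column yields the text that is present for the second column and an empty string for the third. This does not help if the columns are not padded to their full width, because then the offsets themselves are unknown.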
Another, better solution, as has been mentioned in the comments, is to use "special" characters in str to demarcate the boundary between the component strings. If we constructed str like this
str = [NSString stringWithFormat:@"%@%@%@", col1, col2, col3];
and we constrained col1, col2, and col3 not to contain the separator character, then we could parse col1, col2, and col3 as follows:
NSArray *cols = [str componentsSeparatedByString:@""];
col1 = cols[0];
col2 = cols[1];
col3 = cols[2];
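As a concrete sketch of the separator approach, assume the pipe character '|' is the separator and never appears inside a column (any character with that property would do). JoinColumns, SplitColumns, and kColumnSeparator are hypothetical names introduced only for illustration:

```objc
#import <Foundation/Foundation.h>

// Assumption: '|' never appears inside a column, so it can serve as the separator.
static NSString * const kColumnSeparator = @"|";

// Builds the combined string from three columns.
static NSString *JoinColumns(NSString *col1, NSString *col2, NSString *col3)
{
    return [@[col1, col2, col3] componentsJoinedByString:kColumnSeparator];
}

// Returns the three columns, or nil if the input does not split into exactly three parts.
static NSArray<NSString *> *SplitColumns(NSString *str)
{
    NSArray<NSString *> *cols = [str componentsSeparatedByString:kColumnSeparator];
    return cols.count == 3 ? cols : nil;
}
```

With a separator in place, the ambiguous case from earlier disappears: col1 = @"a" with two empty columns and col3 = @"a" with two empty columns now produce different combined strings, so each can be split back into its original columns.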
The situation is no different if you use the space character as the separator.
Edit: You added more information about the input string and the desired output:
Rather than three, there are four component strings: col1, col2, col3, and col4. We have some information about them:
- col1 is between 0 and 10 characters long
- col1 does not contain the space character
- col2 is between 0 and 20 characters long
- col2 does not contain the space character
- col3 is between 0 and 30 characters long
- col3 MAY contain the space character
- col4 isn't constrained in length
- col4 MAY contain the space character
Additionally, the four strings are separated by spaces in their concatenation. So your goal is to uniquely determine col1, col2, col3, and col4 so that str is equal to
[NSString stringWithFormat:@"%@ %@ %@ %@", col1, col2, col3, col4];
You can use an NSScanner to extract col1 and col2 in this case:
NSScanner *scanner = [NSScanner scannerWithString:str];
NSCharacterSet *spaceCharacterSet = [NSCharacterSet characterSetWithCharactersInString:@" "];
NSString *col1 = nil, *col2 = nil;
[scanner scanUpToCharactersFromSet:spaceCharacterSet intoString:&col1];
[scanner scanUpToCharactersFromSet:spaceCharacterSet intoString:&col2];
At this point, it's possible to extract the string remainder, which contains the two final strings col3 and col4 separated by a space:
NSCharacterSet *emptyCharacterSet = [NSCharacterSet characterSetWithCharactersInString:@""];
NSString *remainder = nil;
[scanner scanUpToCharactersFromSet:emptyCharacterSet intoString:&remainder];
At this point, you are back in the same sort of situation I described at the beginning. You have a string (remainder) which consists of two component strings (col3 and col4) which are separated by a space. The only way to detect the border between these two strings is that space.
However, col3 may contain spaces. If it could not, then you could simply scan along until the next space was reached and extract the contents between the beginning and that space into col3 and the rest into col4.
In addition, col4 may also contain spaces. If it could not, then you could scan from the end of remainder until the first space from the end was reached, extract that range into col4 and the rest into col3.
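Under either of those extra assumptions the split becomes unambiguous. A minimal sketch, where SplitRemainder and its col3HasNoSpaces flag are hypothetical names and the caller has to know which assumption holds for their data:

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: splits `remainder` at its first space (when col3 is
// known to contain no spaces) or at its last space (when col4 is known to
// contain no spaces). Returns @[col3, col4], or nil if there is no space.
static NSArray<NSString *> *SplitRemainder(NSString *remainder, BOOL col3HasNoSpaces)
{
    // A forward search finds the first space; a backwards search finds the last.
    NSStringCompareOptions options = col3HasNoSpaces ? 0 : NSBackwardsSearch;
    NSRange space = [remainder rangeOfString:@" " options:options];
    if (space.location == NSNotFound) {
        return nil;
    }
    NSString *col3 = [remainder substringToIndex:space.location];
    NSString *col4 = [remainder substringFromIndex:NSMaxRange(space)];
    return @[col3, col4];
}
```

If both col3 and col4 may contain spaces, neither variant applies and the boundary still cannot be recovered from remainder alone.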