Reputation: 37689
I'm using an old Objective-C routine (let's call it oldObjectiveCFunction) that parses a String by analyzing each char. After analyzing the chars, it splits that String into several Strings and returns them in an array called *functions. This is a heavily reduced sample of how that old function parses the String:
NSMutableArray *functions = [NSMutableArray new];
NSMutableArray *components = [NSMutableArray new];
NSMutableString *sb = [NSMutableString new];
char c;
int sourceLen = source.length;
int index = 0;
while (index < sourceLen) {
c = [source characterAtIndex:index];
//here do some random work analyzing the char
[sb appendString:[NSString stringWithFormat:@"%c",c]];
if (some condition){
[components addObject:(NSString *)sb];
sb = [NSMutableString new];
[functions addObject:[components copy]];
}
index++; // advance to the next character, otherwise the loop never ends
}
Later, I read each String from *functions with this Swift code:
let functions = oldObjectiveCFunction(string) as? [[String]]
functions?.forEach({ (function) in
var functionCopy = function.map { $0 }
for index in 0..<functionCopy.count {
let string = functionCopy[index]
}
})
The problem is that it works perfectly with normal strings, but if the String contains Russian text, like this:
РАЦИОН
the output (the content of my let string variable) is this:
\u{10}&\u{18}\u{1e}\u{1d}
How can I get the same Russian string instead of that?
I tried doing this:
let string2 = String(describing: string?.cString(using: String.Encoding.utf8))
but it returns an even stranger result:
"Optional([32, 16, 38, 24, 30, 29, 0])"
Upvotes: 0
Views: 193
Reputation: 30153
Analysis. Sorry, I don't speak Swift or Objective-C, so the following example is given in Python; however, the 4th and 5th columns (the Unicode code point reduced to 8 bits) match the weird numbers in your question.
for ch in 'РАЦИОН':
print(ch, # character itself
ord(ch), # character unicode in decimal
'{:04x}'.format(ord(ch)), # character unicode in hexadecimal
(ord(ch)&0xFF), # unicode reduced to 8-bit decimal
'{:02x}'.format(ord(ch)&0xFF)) # unicode reduced to 8-bit hexadecimal
Р 1056 0420 32 20
А 1040 0410 16 10
Ц 1062 0426 38 26
И 1048 0418 24 18
О 1054 041e 30 1e
Н 1053 041d 29 1d
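To tie the table back to the question, here is a minimal sketch that rebuilds the string one UTF-16 code unit at a time, once keeping only the low 8 bits of each unit and once keeping all 16. The helper name rebuild is hypothetical, and the bit mask is only an approximation of what assigning a unichar to a char does in Objective-C; it agrees for this input because every low byte is below 0x80.

```python
def rebuild(text, unit_bits):
    """Rebuild text from its UTF-16 code units, keeping only unit_bits bits of each."""
    mask = (1 << unit_bits) - 1
    data = text.encode("utf-16-le")
    # Split the encoded bytes into 16-bit code units (what characterAtIndex: returns).
    units = [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]
    # Simulate storing each unit in a narrower variable.
    masked = [u & mask for u in units]
    return b"".join(u.to_bytes(2, "little") for u in masked).decode("utf-16-le")

source = "РАЦИОН"
truncated = rebuild(source, 8)    # models `char c`
preserved = rebuild(source, 16)   # models `unichar c`

print(repr(truncated))  # ' \x10&\x18\x1e\x1d'
print(preserved)        # РАЦИОН
```

The 8-bit result is a space followed by \x10, &, \x18, \x1e and \x1d, which is the same sequence the Swift side prints as \u{10}&\u{18}\u{1e}\u{1d}.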
Solution. Hence, you need to fix every place in your code that reduces a 16-bit value to 8 bits:
first, declare unichar c; instead of char c; at the 4th line;
and use [sb appendString:[NSString stringWithFormat:@"%C",c]]; at the 11th line. Note that the %C specifier expects a 16-bit UTF-16 code unit (unichar), whereas the %c specifier expects an 8-bit unsigned character (unsigned char).
Resources. My answer is based on answers to the following questions at SO:
Upvotes: 1
Reputation: 2737
Your last result is not strange. The Optional comes from the string?, and the cString() function returns an array of CChar (Int8).
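As a rough illustration in Python (not Swift), assuming the string that came back from the Objective-C routine was already the truncated one: its UTF-8 bytes plus the NUL terminator of a C string give exactly the numbers from the question. The variable names are made up for this sketch.

```python
# The string the Objective-C routine handed back for "РАЦИОН" (already truncated to 8 bits).
truncated = " \x10&\x18\x1e\x1d"

# A C string is the encoded bytes followed by a terminating NUL (0).
c_string = list(truncated.encode("utf-8")) + [0]

print(c_string)  # [32, 16, 38, 24, 30, 29, 0]
```

So the array only exposes the damage that happened earlier; converting encodings on the Swift side cannot bring back the discarded high bytes.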
I think the problem comes from here - but I'm not sure because the whole thing looks confusing:
[sb appendString:[NSString stringWithFormat:@"%c",c]];
Have you tried:
[sb appendString: [NSString stringWithCString:c encoding:NSUTF8StringEncoding]];
instead of stringWithFormat?
(Using %C instead of %c, as proposed by your commenters, looks like a good idea too.) Oops, I just saw that you have tried it without success.
Upvotes: 0