RunLoop

Reputation: 20376

Convert emoji unicode to hex codepoint

I am trying to obtain the hex codepoint for emojis.

The code below successfully returns the hex code point for emoji made up of a single code point (e.g. 1f58d for 🖍️):

NSData *data = [@"🖍️" dataUsingEncoding:NSUTF32LittleEndianStringEncoding];
uint32_t unicode;
[data getBytes:&unicode length:sizeof(unicode)];
NSLog(@"%x", unicode);

However, for emoji like "🤲🏾", whose code points are "1f932-1f3ff", the method above only returns the first point, "1f932". How can I get the full hex code-point sequence for emoji with multiple code points (any code approach is fine)? (Note that some emoji, such as "🚣‍♀️", consist of up to five code points.)

Upvotes: 0

Views: 411

Answers (2)

BenW

Reputation: 331

// Returns every UTF-32 code point that makes up the given string.
- (NSArray<NSNumber *> *)unicodeCodePoints:(NSString *)unicodeChar
{
    NSMutableArray *codePoints = [[NSMutableArray alloc] init];

    // UTF-32 encodes each code point as a fixed four-byte unit, so the
    // encoded data can be walked as a flat array of UInt32 values.
    NSData *data = [unicodeChar dataUsingEncoding:NSUTF32LittleEndianStringEncoding];
    const UInt32 *arr = (const UInt32 *)data.bytes;

    for ( NSUInteger i = 0; i < data.length / sizeof(UInt32); i++ )
    {
        [codePoints addObject:@(arr[i])];
    }

    return codePoints;
}

Then you could call it like this:

for ( NSNumber* num in [self unicodeCodePoints:@"🚣‍♀️"] )
{
    NSLog(@"%0*x", (int)(2*sizeof(UInt32)), (UInt32)[num unsignedIntegerValue]);
}

Please note this assumes the NSString argument represents a single (possibly multi-code-point) character; for a longer string it simply returns the code points of every character in order.
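If you want the hyphen-joined form from the question (e.g. "1f932-1f3ff"), a small sketch along these lines could format the result, assuming a unicodeCodePoints: method like the one above is available:

// Joins the hex code points of an emoji with hyphens, e.g. "1f932-1f3ff".
NSArray<NSNumber *> *points = [self unicodeCodePoints:@"🤲🏾"];
NSMutableArray<NSString *> *hex = [[NSMutableArray alloc] init];
for ( NSNumber *num in points )
{
    [hex addObject:[NSString stringWithFormat:@"%x", num.unsignedIntValue]];
}
NSLog(@"%@", [hex componentsJoinedByString:@"-"]); // 1f932-1f3ff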

Upvotes: 1

Hamid Yusifli

Reputation: 10137

You need to change uint32_t to uint64_t, so the buffer is large enough to hold both 32-bit code points:

NSData *data = [@"🤲🏾" dataUsingEncoding:NSUTF32LittleEndianStringEncoding];
uint64_t unicode; // room for two UTF-32 code units
[data getBytes:&unicode length:sizeof(unicode)];
NSLog(@"%llx", unicode);

Upvotes: 2
