Mohit

Reputation: 1305

Convert u16string to float

I have a UTF-16 encoded string and I want to convert it to a float.

For example:
Given a UTF-16 string like u"1342.223", it should return 1342.223 as a floating-point value. If it were UTF-8, I would convert it with the stod function, but how do I do the same for a UTF-16 encoded std::u16string?

Upvotes: 2

Views: 888

Answers (3)

Galik

Reputation: 48635

There is no standard function for this. If you can use std::wstring on a system that happens to use 16-bit wide characters, you could use:

double d;
std::wistringstream(L"1342.223") >> d;
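
One way to get from a std::u16string to such a wide string is to copy the code units element-wise. The sketch below assumes the input contains only ASCII characters (digits, sign, '.', exponent), so each char16_t value maps to the same wchar_t value. The helper name u16_to_double_via_wstring is made up for illustration, and it uses the std::wstring overload of std::stod rather than a stream:

```cpp
#include <string>

// Hypothetical helper: widen each UTF-16 code unit to wchar_t and let the
// std::wstring overload of std::stod do the parsing.
// Assumes the input holds only ASCII digits, sign, '.', and exponent characters.
double u16_to_double_via_wstring(const std::u16string& u16s)
{
    // Element-wise conversion char16_t -> wchar_t (no surrogate handling).
    std::wstring ws(u16s.begin(), u16s.end());

    // Throws std::invalid_argument or std::out_of_range on failure.
    return std::stod(ws);
}
```

Because of the exceptions, a caller that expects bad input would probably wrap the call in a try/catch block.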

Otherwise you could take advantage of the simple conversion of numeric digits from UTF-16 to ASCII/UTF-8 to write a fast conversion function. It is not ideal but should be reasonably efficient:

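#include <algorithm>
#include <cstdlib>
#include <iterator>
#include <string>

double u16stod(std::u16string const& u16s)
{
    // Narrow copy sized to the input, so long strings cannot overflow it.
    std::string buf(u16s.size(), '\0');

    // Digits, sign, '.', and exponent have the same values in UTF-16 and ASCII.
    std::transform(std::begin(u16s), std::end(u16s), std::begin(buf),
        [](char16_t c){ return char(c); });

    // some error checking here?
    return std::strtod(buf.c_str(), nullptr);
}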
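
The "some error checking here?" comment could be filled in roughly as follows. This is only a sketch: the name u16stod_checked and the bool/out-parameter interface are assumptions, and it presumes the narrow character set is ASCII-compatible. It rejects non-ASCII code units, input with no number in it, trailing junk, and out-of-range values:

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical checked variant: returns false instead of a silent 0.0
// when the input is not a complete, in-range number.
bool u16stod_checked(std::u16string const& u16s, double& out)
{
    std::string narrow;
    narrow.reserve(u16s.size());
    for (char16_t c : u16s) {
        if (c == 0 || c > 0x7F) return false;   // reject NUL and non-ASCII code units
        narrow.push_back(static_cast<char>(c));
    }

    errno = 0;
    char* end = nullptr;
    double value = std::strtod(narrow.c_str(), &end);

    if (end == narrow.c_str()) return false;    // nothing was parsed
    if (*end != '\0') return false;             // trailing junk such as "12abc"
    if (errno == ERANGE) return false;          // overflow or underflow

    out = value;
    return true;
}
```

Note that std::strtod itself still accepts leading whitespace and forms such as "inf" or hexadecimal floats, so stricter input rules would need an extra check before the call.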

Upvotes: 1

Serge Ballesta

Reputation: 149085

First, converting a UTF-16 numeric character string to a narrow character string is trivial. Even if you cannot be sure that the narrow character set is ASCII for 7-bit characters, C guarantees that the codes for '0' to '9' are consecutive, and the same is true for Unicode (0x30 to 0x39). So the code can be as simple as this (it only depends on including <string> and, for strtod, <cstdlib>):

double u16strtod(const std::u16string& u16) {
    char *beg = new char[u16.size() + 1];
    char *str = beg;
    for (char16_t uc: u16) {
        if (uc == u' ') *str++ = ' ';     // special processing for possible . and space
        else if (uc == u'.') *str++ = '.';
        else if ((uc < u'0') || (uc > u'9')) break;  // could use better error processing
        else {
            *str++ = '0' + (uc - u'0');
        }
    }
    *str++ = '\0';
    char *end;
    double d = strtod(beg, &end);   // could use better error processing
    delete[] beg;
    return d;
}

It is even simpler if narrow charset is ASCII:

double u16strtod(const std::u16string& u16) {
    char *beg = new char[u16.size() + 1];
    char *str = beg;
    for (char16_t uc: u16) {
        if ((uc == 0) || (uc >= 127)) break;  // can only contain ASCII characters (char16_t is unsigned)
        else {
            *str++ = static_cast<char>(uc);   // and the Unicode code IS the ASCII code
        }
    }
    *str++ = '\0';
    char *end;
    double d = strtod(beg, &end);
    delete[] beg;
    return d;
}

Upvotes: 1

einpoklum

Reputation: 131976

If you know for a fact that your string is nicely formatted (e.g. no spaces), and if and only if performance is critical (i.e. you are parsing millions or billions of numbers), don't dismiss the possibility of just decoding it yourself by looping over the string. Look at the standard library source code (perhaps compare libc++ and libstdc++) to see what they do, and adapt it. Of course, in these cases, you should also take care to parallelize your work, try to exploit SIMD, and so on.
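
To give an idea of what decoding it yourself might look like, here is a deliberately naive sketch. The function name parse_u16_decimal is made up, and it assumes well-formed input of the form [sign]digits[.digits] with no whitespace and no exponent. It does no validation and is not guaranteed to be correctly rounded for long digit strings, so it is a starting point rather than a replacement for std::strtod:

```cpp
#include <cstddef>
#include <string>

// Hypothetical hand-rolled parser for [sign]digits[.digits].
// Assumes well-formed input; anything after the number is silently ignored.
double parse_u16_decimal(const std::u16string& s)
{
    std::size_t i = 0;
    bool negative = false;
    if (i < s.size() && (s[i] == u'-' || s[i] == u'+')) {
        negative = (s[i] == u'-');
        ++i;
    }

    // Accumulate the integer part.
    double value = 0.0;
    for (; i < s.size() && s[i] >= u'0' && s[i] <= u'9'; ++i)
        value = value * 10.0 + (s[i] - u'0');

    // Accumulate the fractional digits as if they were integer digits,
    // then scale once at the end with a single division.
    if (i < s.size() && s[i] == u'.') {
        ++i;
        double scale = 1.0;
        for (; i < s.size() && s[i] >= u'0' && s[i] <= u'9'; ++i) {
            value = value * 10.0 + (s[i] - u'0');
            scale *= 10.0;
        }
        value /= scale;
    }
    return negative ? -value : value;
}
```

Working directly on char16_t avoids the intermediate narrow copy, which is the main saving over the strtod-based approaches.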

Upvotes: 0
