fccoelho

Reputation: 6204

D-language: How to print Unicode characters to the console?

I have the following simple program that is meant to generate a random Unicode string from the union of three Unicode scripts.

#!/usr/bin/env rdmd
import std.uni;
import std.random : randomSample;
import std.stdio;
import std.conv;

/**
*  Random salt generator
*/
dstring get_salt(uint s)
{
    auto unicodechars = unicode("Cyrillic") | unicode("Armenian") | unicode("Telugu");
    dstring unichars =  to!dstring(unicodechars);

    return to!dstring(randomSample(unichars, s));
}

void main()
{
    writeln("Random salt:");
    writeln(get_salt(32));
}

However, the output of the writeln is:

$ ./teste.d
Random salt:
rw13  13437 78580112 104 3914645

What are these numbers? Unicode code points? How do I print the actual characters? I am on Ubuntu Linux with the locale set to UTF-8.

Upvotes: 2

Views: 297

Answers (1)

Adam D. Ruppe

Reputation: 25605

This line is the problem:

dstring unichars =  to!dstring(unicodechars);

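It converts the CodepointSet object that unicode returns into a string describing the set, not into the characters the set covers. That string form is just the type name and the boundaries of the character ranges in the set, not the characters themselves. The conversion produced this: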

InversionList!(GcPolicy)(CowArray!(GcPolicy)([1024, 1157, 1159, 1320, 1329, 1367, 1369, 1376, 1377, 1416, 1418, 1419, 1423, 1424, 3073, 3076, 3077, 3085, 3086, 3089, 3090, 3113, 3114, 3124, 3125, 3130, 3133, 3141, 3142, 3145, 3146, 3150, 3157, 3159, 3160, 3162, 3168, 3172, 3174, 3184, 3192, 3200, 7467, 7468, 7544, 7545, 11744, 11776, 42560, 42648, 42655, 42656, 64275, 64280, 5]))

And then randomSample pulled random characters out of that string! Instead, you want:

dstring unichars =  to!dstring(unicodechars.byCodepoint);

Calling byCodepoint on the set yields the actual characters it contains (well, code points; Unicode is messy). You then build a string from those and sample from it.
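
For reference, here is a minimal sketch of the whole function with that fix applied. The name getSalt, the size_t parameter, and the intermediate pool array are my own choices for illustration, not part of the original program, and this is only one of several ways to write it:

```d
import std.array : array;
import std.random : randomSample;
import std.uni : unicode;

/// Sketch of a salt generator that samples real code points.
/// `getSalt` and `pool` are illustrative names, not from the question.
dstring getSalt(size_t n)
{
    // Union of the three scripts, as in the question.
    auto set = unicode("Cyrillic") | unicode("Armenian") | unicode("Telugu");

    // Expand the set into its individual code points
    // instead of stringifying the set object itself.
    dchar[] pool = set.byCodepoint.array;

    // Pick n code points from the pool and freeze them into a dstring.
    return randomSample(pool, n).array.idup;
}
```

Calling writeln(getSalt(32)) from main should then print 32 characters rather than digits. Note that randomSample draws without replacement and keeps the elements in their original order, so the result comes out sorted by code point; if that matters for a salt, the sample could be shuffled afterwards (for example with std.random.randomShuffle on the array before calling idup).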

Upvotes: 5
