Reputation: 6204
I have the following simple program to generate a random Unicode string from the union of three Unicode character sets.
#!/usr/bin/env rdmd
import std.uni;
import std.random : randomSample;
import std.stdio;
import std.conv;
/**
 * Random salt generator
 */
dstring get_salt(uint s)
{
    auto unicodechars = unicode("Cyrillic") | unicode("Armenian") | unicode("Telugu");
    dstring unichars = to!dstring(unicodechars);
    return to!dstring(randomSample(unichars, s));
}

void main()
{
    writeln("Random salt:");
    writeln(get_salt(32));
}
However, the output of the writeln is:
$ ./teste.d
Random salt:
rw13 13437 78580112 104 3914645
What are these numbers? Unicode code points? How do I print the actual characters? I am on Ubuntu Linux with the locale set to UTF-8.
Upvotes: 2
Views: 297
Reputation: 25605
This line is the problem you have:
dstring unichars = to!dstring(unicodechars);
It converts the CodepointSet object that unicode returns to a string, not to the characters it covers. The set stores the boundaries of its character ranges, not the characters themselves. The conversion produced this:
InversionList!(GcPolicy)(CowArray!(GcPolicy)([1024, 1157, 1159, 1320, 1329, 1367, 1369, 1376, 1377, 1416, 1418, 1419, 1423, 1424, 3073, 3076, 3077, 3085, 3086, 3089, 3090, 3113, 3114, 3124, 3125, 3130, 3133, 3141, 3142, 3145, 3146, 3150, 3157, 3159, 3160, 3162, 3168, 3172, 3174, 3184, 3192, 3200, 7467, 7468, 7544, 7545, 11744, 11776, 42560, 42648, 42655, 42656, 64275, 64280, 5]))
Your code then pulled random characters out of that string! Instead, you want:
dstring unichars = to!dstring(unicodechars.byCodepoint);
Calling the byCodepoint method on that object yields the actual characters (well, code points; Unicode is messy) inside the set. You then build a string out of those and sample from it.
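For reference, here is a minimal sketch of the whole program with that fix applied. It assumes that collecting the code points with std.array.array and then sampling from the resulting dchar[] is acceptable; the pool variable name and the use of array and idup in place of to!dstring are choices made for this sketch, not part of the original code.

```d
#!/usr/bin/env rdmd
import std.array : array;
import std.random : randomSample;
import std.stdio : writeln;
import std.uni : unicode;

/**
 * Random salt generator
 */
dstring get_salt(uint s)
{
    auto unicodechars = unicode("Cyrillic") | unicode("Armenian") | unicode("Telugu");
    // byCodepoint expands the stored intervals into individual code points.
    dchar[] pool = unicodechars.byCodepoint.array;
    // randomSample picks s distinct elements and keeps their original order.
    return randomSample(pool, s).array.idup;
}

void main()
{
    writeln("Random salt:");
    writeln(get_salt(32));
}
```

Note that randomSample samples without replacement and preserves the order of the source range, so the characters in the result appear in ascending code point order. If that matters for your use case, you may want to shuffle the result or pick elements independently.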
Upvotes: 5