Marked One

Reputation: 364

String that contains English and Hebrew letters gets messed up after String.Join() - C# .NET

I have a string containing both English and Hebrew characters:
"Hitachi - היטצ'י:Hitachi – cartel CRT"

First step: swap the two parts that are separated by the colon (:).
Expected result: "Hitachi – cartel CRT:Hitachi - היטצ'י"

Next: I would like to concatenate the following text: ":אגם:עץ תיוק"
Final expected result: "Hitachi - cartel CRT:Hitachi - אגם:עץ תיוק:היטצ'י"

Actual result: "Hitachi – cartel CRT:Hitachi - היטצ'י:אגם:עץ תיוק"

This is my current code:

string path = "Hitachi - היטצ'י:Hitachi – cartel CRT";
string[] splittedByColonPath = path.Split(':');
Array.Reverse(splittedByColonPath);
List<string> list = new List<string>(splittedByColonPath);
list.Add("אגם:עץ תיוק:");            
string result = String.Join(":", list.ToArray());

Any ideas on how to rearrange it the proper way?

Upvotes: 0

Views: 895

Answers (1)

psmears

Reputation: 27990

String.Join is working just fine, and the string contains exactly the characters you want, in the order you want. (You can test this if you like by writing some code to print the string one character at a time, one character on each line.) The trouble is that, when the string is displayed, all the Hebrew text and the colons inside it are treated as one phrase, and since Hebrew is written right-to-left, the first word in that phrase appears on the right.
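A minimal sketch of that check, assuming a console application. The `CharDump` class and its `Describe` method are hypothetical names introduced here for illustration. Each UTF-16 code unit is printed with its index and code point, so the stored order is visible even if the console cannot render Hebrew:

```csharp
using System;
using System.Collections.Generic;

static class CharDump
{
    // Hypothetical helper: one line per UTF-16 code unit, in stored (logical) order.
    public static IEnumerable<string> Describe(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            // Index, code point in hex, then the character itself.
            yield return String.Format("{0,3}  U+{1:X4}  {2}", i, (int)s[i], s[i]);
        }
    }

    static void Main()
    {
        // Same steps as in the question.
        string path = "Hitachi - היטצ'י:Hitachi – cartel CRT";
        string[] parts = path.Split(':');
        Array.Reverse(parts);
        List<string> list = new List<string>(parts);
        list.Add("אגם:עץ תיוק:");
        string result = String.Join(":", list.ToArray());

        foreach (string line in Describe(result))
        {
            Console.WriteLine(line);
        }
    }
}
```

Reading the output from top to bottom gives the order in which the characters are stored, independent of how a bidirectional renderer lays them out on a single line.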

Depending on what you want to achieve, this may be fine (e.g. if you're passing it to another program that expects data separated by colons; in that case the string may look wrong, but the other program will interpret it just fine). But if you want it to look the way you're expecting, you have to force the display algorithm to treat the colons as left-to-right. You may be able to do this by changing the code to be

string result = String.Join("\u200e:", list.ToArray());

The \u200e is the Unicode left-to-right mark (LRM), an invisible character that causes any adjacent punctuation to be treated as left-to-right.

The downside of doing this is that any other program interpreting the data may not expect the LRM and may be confused by it.
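One way to limit that downside, sketched here as an assumption rather than a recommendation from the answer, is to keep a plain colon-joined string for data and build the LRM version only for display. The `PathDisplay` class and its method names are hypothetical:

```csharp
using System;

static class PathDisplay
{
    // U+200E LEFT-TO-RIGHT MARK.
    const string Lrm = "\u200E";

    // Data form: plain colons, with no invisible characters added.
    public static string JoinForData(string[] parts)
    {
        return String.Join(":", parts);
    }

    // Display form: an LRM in front of every separator colon.
    public static string JoinForDisplay(string[] parts)
    {
        return String.Join(Lrm + ":", parts);
    }

    // Removes every LRM, e.g. from a display string that was copied back into data.
    public static string StripLrm(string s)
    {
        return s.Replace(Lrm, "");
    }

    static void Main()
    {
        string[] parts = { "Hitachi – cartel CRT", "Hitachi - היטצ'י", "אגם", "עץ תיוק" };
        Console.WriteLine(JoinForDisplay(parts));
        // The stripped display string equals the data string,
        // provided no part contains an LRM of its own.
        Console.WriteLine(StripLrm(JoinForDisplay(parts)) == JoinForData(parts));
    }
}
```

Note that `StripLrm` removes every LRM in the string, including any that were already present in the input, so it only round-trips cleanly when the original parts contain none.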

Upvotes: 3
