Ashkan Mobayen Khiabani

Reputation: 34152

String.Replace not working correctly

This is very odd, as I have used the Replace function thousands of times. This is my code:

while (d.IndexOf("--") != -1) d=d.Replace("--", "-");

and this is the variable d's value when I trace:

"آدنیس,اسم دختر,girl name,آدونیس--‌-گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود"

but the loop gets stuck when the value of d is:

"آدنیس,اسم دختر,girl name,آدونیس-‌-گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود"

Can anybody tell me why? It's funny, because even the dashes are added programmatically.

Upvotes: 9

Views: 946

Answers (3)

basarat

Reputation: 276209

That is because this:

var d1 = "آدنیس,اسم دختر,girl name,آدونیس--‌-گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";

is not the same as this:

var d2 = "آدنیس,اسم دختر,girl name,آدونیس---گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";

The three dashes in your string are not three consecutive hyphens: there is an invisible character (most likely U+200C, the zero-width non-joiner) before the last one. Try it yourself:

var d1 = "آدنیس,اسم دختر,girl name,آدونیس--‌-گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";
var d2 = "آدنیس,اسم دختر,girl name,آدونیس---گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";
while (d1.IndexOf("--", StringComparison.Ordinal) != -1) d1 = d1.Replace("--", "-");
Console.WriteLine(d1); // two hyphens remain, separated by the invisible character
while (d2.IndexOf("--", StringComparison.Ordinal) != -1) d2 = d2.Replace("--", "-");
Console.WriteLine(d2); // All clear 

Just FYI: IndexOf(string) is culture-sensitive by default, whereas Replace(string, string) is ordinal. I would use:

var d = "آدنیس,اسم دختر,girl name,آدونیس--‌-گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";
while (d.IndexOf("--", System.StringComparison.Ordinal) != -1) 
      d = d.Replace("--", "-");

This uses ordinal rules, i.e. it compares culture-independent Unicode values, and it also runs faster.
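To make the hidden character visible, here is a minimal sketch that writes it as an escape sequence. It assumes the character in the question is U+200C (zero-width non-joiner); CollapseHyphens is a hypothetical helper name, not something from the original code.

```csharp
using System;

// Hypothetical helper: collapse runs of '-' using ordinal matching for both
// the search (IndexOf) and the replacement (Replace is always ordinal).
static string CollapseHyphens(string s)
{
    while (s.IndexOf("--", StringComparison.Ordinal) != -1)
        s = s.Replace("--", "-");
    return s;
}

// Assumption: the invisible character is U+200C, written here as an escape.
string plain = "a---b";
string withHidden = "a--\u200C-b";

Console.WriteLine(CollapseHyphens(plain));             // a-b
Console.WriteLine(CollapseHyphens(withHidden).Length); // 5: 'a', '-', U+200C, '-', 'b'
```

The loop now terminates for both inputs, but the second result still contains two hyphens, because they were never adjacent in the first place.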

Upvotes: 18

Stuart

Reputation: 66882

I've tested this with LinqPad - interesting.

// d0 succeeds:
var d0 = "world--life";

while (d0.IndexOf("--") != -1) 
{
    d0=d0.Replace("--", "-");
    d0.Dump();
}

// d1 loops forever
var d1 = "world--life";

while (d1.IndexOf("--") != -1) 
{
    d1=d1.Replace("-‌-", "-");
    d1.Dump();
}

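The difference between the two loops is that, while they may appear identical, in the second loop the string passed to Replace contains an invisible character between the two hyphens, so it is not the same string that IndexOf searches for.

Looking at the MSDN docs, Replace(string, string) performs an ordinal (case-sensitive and culture-insensitive) search, while IndexOf(string) performs a word (case-sensitive and culture-sensitive) search using the current culture.

So the difference is culture-insensitive versus culture-sensitive.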
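A quick way to check whether two strings that look the same really are the same is to print their code units. This is only a sketch: CodeUnits is a hypothetical helper, and U+200C is my assumption about which invisible character is involved.

```csharp
using System;
using System.Linq;

// Hypothetical helper: render each UTF-16 code unit of a string as U+XXXX.
static string CodeUnits(string s) =>
    string.Join(" ", s.Select(c => $"U+{(int)c:X4}"));

Console.WriteLine(CodeUnits("--"));       // U+002D U+002D
Console.WriteLine(CodeUnits("-\u200C-")); // U+002D U+200C U+002D

string s = "-\u200C-";
// Culture-sensitive search: may report a match, since zero-width characters
// can be treated as ignorable (the result depends on runtime and culture).
Console.WriteLine(s.IndexOf("--"));
// Ordinal search: compares code units exactly, so there is no match here.
Console.WriteLine(s.IndexOf("--", StringComparison.Ordinal)); // -1
```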

Upvotes: 3

John Woo

Reputation: 263693

You can use Regex.Replace():

string _txt = "----------";
_txt = Regex.Replace(_txt, @"\-{2,}", "-");

This will output: -
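Note that the pattern above only matches hyphens that are directly adjacent. If the run is interrupted by an invisible character, as seems to be the case in the question, a variant like the following might help. It is a sketch that assumes the character is U+200C and that dropping it between hyphens is acceptable; CollapseHyphenRuns is a hypothetical name.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: collapse a run of hyphens into one, also swallowing
// any U+200C characters that sit between the hyphens of the run.
static string CollapseHyphenRuns(string s) =>
    Regex.Replace(s, @"-(?:\u200C*-)+", "-");

Console.WriteLine(CollapseHyphenRuns("----------"));  // -
Console.WriteLine(CollapseHyphenRuns("a--\u200C-b")); // a-b
Console.WriteLine(CollapseHyphenRuns("a-b"));         // a-b (single hyphens are untouched)
```

A U+200C that is not between two hyphens is left alone, which matters for Persian text where it is a legitimate character.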

Upvotes: 4
