Reputation: 34152
This is very odd, as I have used the Replace function thousands of times. This is my code:
while (d.IndexOf("--") != -1) d=d.Replace("--", "-");
and this is the variable d's value when I trace:
"آدنیس,اسم دختر,girl name,آدونیس---گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود"
but it gets stuck when the value of d is:
"آدنیس,اسم دختر,girl name,آدونیس--گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود"
Can anybody tell me why? It's funny, because even the dashes are added programmatically.
Upvotes: 9
Views: 946
Reputation: 276209
That is because this:
var d1 = "آدنیس,اسم دختر,girl name,آدونیس---گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";
is not the same as this:
var d2 = "آدنیس,اسم دختر,girl name,آدونیس---گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";
The last three dash characters in your string are not actually the Unicode hyphen-minus (-, U+002D).
Try it yourself:
var d1 = "آدنیس,اسم دختر,girl name,آدونیس---گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";
var d2 = "آدنیس,اسم دختر,girl name,آدونیس---گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";
while (d1.IndexOf("--", StringComparison.Ordinal) != -1) d1 = d1.Replace("--", "-");
Console.WriteLine(d1); // the last characters are left
while (d2.IndexOf("--", StringComparison.Ordinal) != -1) d2 = d2.Replace("--", "-");
Console.WriteLine(d2); // All clear
Just FYI: the IndexOf(string) overload performs a culture-sensitive comparison by default. I would use:
var d = "آدنیس,اسم دختر,girl name,آدونیس---گلی-به-رنگ-زرد-و-قرمز-که-فقط-هنگام-تابش-خورشید-باز-می-شود";
while (d.IndexOf("--", System.StringComparison.Ordinal) != -1)
    d = d.Replace("--", "-");
This uses ordinal rules, i.e. a culture-independent comparison of the raw Unicode values, and it also runs faster.
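To see which characters are really in the string, it can help to print the code units one by one. The following is only a minimal sketch: DumpCodeUnits is a hypothetical helper, and the sample string assumes the hidden character is a zero-width non-joiner (U+200C), which is common in Persian text but is not confirmed by the question.

```csharp
using System;
using System.Linq;

// Hypothetical helper: render each UTF-16 code unit as U+XXXX so that
// invisible characters show up in the output.
static string DumpCodeUnits(string s) =>
    string.Join(" ", s.Select(c => $"U+{(int)c:X4}"));

// Assumed sample: two hyphens with a zero-width non-joiner between them.
string sample = "a-\u200C-b";

Console.WriteLine(DumpCodeUnits(sample));
// U+0061 U+002D U+200C U+002D U+0062

// An ordinal search compares raw code units, so it finds no "--" here.
Console.WriteLine(sample.IndexOf("--", StringComparison.Ordinal)); // -1
```

Running the same helper on the real value of d should show whether anything other than U+002D sits inside the run of dashes.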
Upvotes: 18
Reputation: 66882
I've tested this with LinqPad - interesting.
// d0 succeeds:
var d0 = "world--life";
while (d0.IndexOf("--") != -1)
{
    d0 = d0.Replace("--", "-");
    d0.Dump();
}
// d1 loops forever
var d1 = "world--life";
while (d1.IndexOf("--") != -1)
{
    d1 = d1.Replace("--", "-");
    d1.Dump();
}
The two loops look identical, but the second one actually uses different Unicode characters for the hyphens in IndexOf than for the ones in Replace.
Looking at the MSDN docs:
IndexOf - http://msdn.microsoft.com/en-us/library/k8b1470s.aspx - This method performs a word (case-sensitive and culture-sensitive) search using the current culture. The search begins at the first character position of this instance and continues until the last character position.
Replace - http://msdn.microsoft.com/en-us/library/fk49wtc1.aspx - This method performs an ordinal (case-sensitive and culture-insensitive) search to find oldValue.
So the difference is that Replace is culture-insensitive (ordinal), while IndexOf is culture-sensitive.
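Because the two methods can disagree, a loop that searches with one and replaces with the other may never make progress. Here is a rough sketch of a guarded loop; CollapseDashes is a hypothetical helper, and the sample with U+200C between the hyphens is an assumption about what the hidden character might be.

```csharp
using System;

// Hypothetical helper: collapse "--" to "-", but stop as soon as Replace
// makes no progress, so a search/replace disagreement cannot hang the loop.
static string CollapseDashes(string s, StringComparison comparison)
{
    while (s.IndexOf("--", comparison) != -1)
    {
        string next = s.Replace("--", "-"); // Replace(string, string) is ordinal
        if (next == s)
        {
            break; // IndexOf reported a match that Replace cannot see
        }
        s = next;
    }
    return s;
}

// Assumed sample: a zero-width non-joiner hidden between the two hyphens.
string sample = "world-\u200C-life";

Console.WriteLine(CollapseDashes("world---life", StringComparison.Ordinal)); // world-life
Console.WriteLine(CollapseDashes(sample, StringComparison.Ordinal) == sample); // True

// With a culture-sensitive search, whether IndexOf reports a match depends on
// the runtime's globalization data. Either way the guard ends the loop and the
// string comes back unchanged, because the ordinal Replace finds nothing.
Console.WriteLine(CollapseDashes(sample, StringComparison.CurrentCulture) == sample); // True
```

Passing StringComparison.Ordinal keeps the search consistent with Replace, so the guard is only a safety net in that case.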
Upvotes: 3
Reputation: 263693
You can use Regex.Replace():
string _txt = "----------";
_txt = Regex.Replace(_txt, @"\-{2,}", "-");
This will output: -
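If the run of dashes is interrupted by invisible characters, a plain run pattern will not match across them. As a sketch only, the pattern below also tolerates Unicode format characters (category Cf, which includes U+200C and the soft hyphen U+00AD) between the hyphens. CollapseHyphenRuns is a hypothetical helper, and the assumption that the hidden characters are format characters is mine, not something the question confirms.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: match a hyphen followed by one or more further hyphens,
// each optionally preceded by invisible format characters (\p{Cf}), and
// replace the whole run with a single hyphen.
static string CollapseHyphenRuns(string s) =>
    Regex.Replace(s, @"-(?:\p{Cf}*-)+", "-");

Console.WriteLine(CollapseHyphenRuns("----------"));        // -
Console.WriteLine(CollapseHyphenRuns("world-\u200C-life")); // world-life
Console.WriteLine(CollapseHyphenRuns("a-b"));               // a-b
```

Format characters that are not between two hyphens are left alone, which matters for Persian text where the zero-width non-joiner is meaningful.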
Upvotes: 4