gil kr

Reputation: 2280

Why are String.IndexOf and String.Contains disagreeing when provided with Arabic text?

I want to know if I found a bug in the .NET Framework, or if I don't understand something. After running this piece of code:

var text = "مباركُ وبعض أكثر من نص";
var word = "مبارك";
bool exist = text.Contains(word);
int index = text.IndexOf(word);

The results are exist = true and index = -1.

How can it be?

Upvotes: 6

Views: 344

Answers (1)

Jon Skeet

Reputation: 1499800

Contains is culture-insensitive:

"This method performs an ordinal (case-sensitive and culture-insensitive) comparison."

IndexOf is culture-sensitive:

"This method performs a word (case-sensitive and culture-sensitive) search using the current culture."

That's the difference. If you use

int index = text.IndexOf(word, StringComparison.Ordinal);

then you'll get an index of 0 instead of -1 (so it's consistent with Contains).
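To see the two comparison modes side by side, here is a minimal sketch that reuses the strings from the question and passes the comparison type explicitly. The class name OrdinalVsCultureDemo is just an illustrative choice, and the culture-sensitive result may vary with the runtime and the current culture, so the code prints it rather than assuming a value.

```csharp
using System;

class OrdinalVsCultureDemo
{
    static void Main()
    {
        var text = "مباركُ وبعض أكثر من نص";
        var word = "مبارك";

        // Contains(string) always performs an ordinal comparison.
        bool exist = text.Contains(word);

        // Ordinal: compares the UTF-16 code units directly, no culture rules.
        int ordinalIndex = text.IndexOf(word, StringComparison.Ordinal);

        // Culture-sensitive: the same comparison that text.IndexOf(word) uses
        // implicitly. The result depends on the current culture and runtime.
        int cultureIndex = text.IndexOf(word, StringComparison.CurrentCulture);

        Console.WriteLine("Contains:       " + exist);
        Console.WriteLine("Ordinal:        " + ordinalIndex);
        Console.WriteLine("CurrentCulture: " + cultureIndex);
    }
}
```

Spelling out the StringComparison argument on every call makes it obvious which rules are in play, which is probably the simplest way to avoid this kind of surprise.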

There's no culture-sensitive overload of Contains; it's unclear to me whether you can use IndexOf reliably for this, but the CompareInfo class gives some more options. (I really don't know much about the details of cultural comparisons, particularly with RTL text. I just know it's complicated!)
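As a rough sketch of what the CompareInfo route could look like: CompareInfo.IndexOf accepts a CompareOptions value, and CompareOptions.IgnoreNonSpace asks the comparison to ignore nonspacing combining marks, a category that includes Arabic diacritics such as the damma at the end of the first word in the question. Whether this is the right behaviour for a given application is an assumption to verify, and the class name CompareInfoDemo is purely illustrative.

```csharp
using System;
using System.Globalization;

class CompareInfoDemo
{
    static void Main()
    {
        var text = "مباركُ وبعض أكثر من نص";
        var word = "مبارك";

        CompareInfo compareInfo = CultureInfo.CurrentCulture.CompareInfo;

        // Culture-sensitive search with default options.
        int strictIndex = compareInfo.IndexOf(text, word, CompareOptions.None);

        // Culture-sensitive search that ignores nonspacing combining marks
        // (diacritics). Assumption: this is the matching behaviour you want.
        int relaxedIndex = compareInfo.IndexOf(text, word, CompareOptions.IgnoreNonSpace);

        // A culture-sensitive "contains" built on top of IndexOf.
        bool containsIgnoringMarks = relaxedIndex >= 0;

        Console.WriteLine("CompareOptions.None:           " + strictIndex);
        Console.WriteLine("CompareOptions.IgnoreNonSpace: " + relaxedIndex);
        Console.WriteLine("Contains (ignoring marks):     " + containsIgnoringMarks);
    }
}
```

The results are printed rather than stated here because they can differ between cultures and between runtimes, so it is worth running this against the cultures you actually need to support.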

Upvotes: 9
