Reputation: 938
I am very confused by this problem. Below are the two strings I am comparing. They look exactly the same to the naked eye, but when I compare them in C# code or in MS Excel, the result is "Mismatch".
1st: Frillestads_församling_Länsräkenskaper efter 1917. Mantalslängder 1918-1991 Special 99
2nd: Frillestads_församling_Länsräkenskaper efter 1917. Mantalslängder 1918-1991 Special 99
Even when I tried to split them into a string array on a single space (' '), the 1st string was not split.
Here is the C# code:
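private void btnFindMismatch_Click(object sender, EventArgs e)
{
    // Collapse runs of spaces, then compare the two values word by word.
    string value1 = FormattedString(txtFirstValue.Text);
    string value2 = FormattedString(txtSecondValue.Text);
    bool isMismatchFound = false;
    string[] value1Array = value1.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    string[] value2Array = value2.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    // Walk only the indexes both arrays share, so a shorter second array
    // cannot cause an IndexOutOfRangeException.
    int sharedLength = Math.Min(value1Array.Length, value2Array.Length);
    for (int i = 0; i < sharedLength; i++)
    {
        if (!value1Array[i].Equals(value2Array[i]))
        {
            lblResult.Text = string.Format("Mismatch in index: {0}; 1st Char: {1}; 2nd Char: {2}", i, value1Array[i], value2Array[i]);
            isMismatchFound = true;
            break;
        }
    }
    // A different number of words is also a mismatch, even if the shared words agree.
    if (!isMismatchFound && value1Array.Length != value2Array.Length)
    {
        lblResult.Text = string.Format("Mismatch in word count: {0} vs {1}", value1Array.Length, value2Array.Length);
        isMismatchFound = true;
    }
    if (!isMismatchFound)
    {
        lblResult.Text = "No Mismatch";
    }
    MessageBox.Show("Complete");
}

private string FormattedString(string value)
{
    // Replace every run of two or more ordinary spaces (U+0020) with a single space.
    RegexOptions options = RegexOptions.None;
    Regex regex = new Regex(@"[ ]{2,}", options);
    value = regex.Replace(value, @" ");
    return value;
}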
I then checked the 1st value in Notepad++ and found that the 1st string did not contain any "White Space".
Please see the screenshots below for a clearer view.
Upvotes: 1
Views: 1924
Reputation: 336108
It appears that those aren't normal spaces (0x20) but perhaps non-breaking spaces (0xA0). If you use the universal whitespace shorthand \s instead of a standard space character, it should work.
Regex regex = new Regex(@"\s{2,}", options); // for example
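One thing to watch for: a pattern with {2,} only collapses runs of two or more characters, so a single non-breaking space between two words would stay in the string and Split(' ') would still not split on it. A minimal sketch of one way around that, assuming it is acceptable to turn every whitespace run (tabs and newlines included) into one ordinary space. The class and method names here are hypothetical, not part of the question's code:

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceNormalizer
{
    // Replaces every run of one or more whitespace characters with a single
    // ordinary space (U+0020). With default options, .NET's \s is
    // Unicode-aware, so it also matches the non-breaking space U+00A0.
    public static string Normalize(string value)
    {
        return Regex.Replace(value, @"\s+", " ");
    }

    // Normalizes first, then splits on the ordinary space.
    public static string[] SplitWords(string value)
    {
        return Normalize(value).Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

With this, WhitespaceNormalizer.SplitWords(txtFirstValue.Text) and the same call on the second text box should yield comparable arrays, provided whitespace is the only difference between the two strings.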
Note that \s will also match newlines, tabs and other whitespace, so perhaps you want to make the regex more specific, depending on which space character is actually being used (Notepad++ probably has a hexadecimal mode that will allow you to check which one it is exactly):
Regex regex = new Regex(@"[ \xa0]{2,}", options);
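If a hex view is not at hand, the offending character can also be identified from code. The following is a rough sketch, not a definitive tool: it finds the first position at which two strings differ and formats a character as a code point. The StringDiff class and its method names are made up for illustration:

```csharp
using System;

public static class StringDiff
{
    // Returns the index of the first position where the strings differ,
    // or -1 if they are identical. If one string is a prefix of the other,
    // the length of the shorter string is returned.
    public static int FirstDifference(string a, string b)
    {
        int shared = Math.Min(a.Length, b.Length);
        for (int i = 0; i < shared; i++)
        {
            if (a[i] != b[i])
            {
                return i;
            }
        }
        return a.Length == b.Length ? -1 : shared;
    }

    // Formats a UTF-16 code unit as U+XXXX, e.g. U+00A0 for a non-breaking space.
    public static string Describe(char c)
    {
        return string.Format("U+{0:X4}", (int)c);
    }
}
```

Calling StringDiff.FirstDifference(value1, value2) and then StringDiff.Describe on the character at that index in each string shows which code points are involved. A result of U+00A0 against U+0020 would confirm the non-breaking space guess.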
Upvotes: 2