andy250

Reputation: 20414

Dotnet string searching ambiguity (netcore31 vs net47 vs roslyn)

I found that in some cases string.IndexOf() returns -1 when searching for a double line break (\r\n\r\n) that is immediately followed by a byte sequence starting with 0xCC (204). Other lead bytes may trigger it too; I haven't checked them all. Below is a dotnetfiddle sample, where the compiler can be selected. With the netcore31 compiler the string is always found; with the other compilers (net47, roslyn) it is found only when it is not followed by the 0xCC sequence. Can anyone explain this?

https://dotnetfiddle.net/ZMW7tL

EDIT: The same happens when I put 0xCB or 0xCD after the last \r\n.

Source code of the fiddle:

using System;
                
public class Program
{
    public static void Main()
    {
        // 204, 159 (0xCC 0x9F) is the UTF-8 encoding of U+031F, a combining character
        var x = new byte[] { 99, 108, 111, 115, 101, 13, 10, 13, 10, 204, 159, 67, 4 };
        var z = System.Text.Encoding.UTF8.GetString(x);
        Console.WriteLine(z);
        Console.WriteLine();
        var idx = z.IndexOf("\r\n\r\n");
        Console.WriteLine("index = " + idx);
        Console.WriteLine("=========================");
 
        var x1 = new byte[] { 99, 108, 111, 115, 101, 13, 10, 13, 10, 5, 159, 67, 4 };
        var z1 = System.Text.Encoding.UTF8.GetString(x1);
        Console.WriteLine(z1);
        Console.WriteLine();
        var idx1 = z1.IndexOf("\r\n\r\n");
        Console.WriteLine("index modified = " + idx1);
        Console.WriteLine("=========================");
 
        var x2 = new byte[] { 13, 10, 13, 10, 204, 159 };
        var z2 = System.Text.Encoding.UTF8.GetString(x2);
        Console.WriteLine(z2);
        Console.WriteLine();
        var idx2 = z2.IndexOf("\r\n\r\n");
        Console.WriteLine("index short = " + idx2);
    }
}
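
For comparison, here is a minimal sketch that runs the same search with an ordinal comparison. The reasoning behind it is a guess that I have not confirmed: string.IndexOf(string) is culture-sensitive, and the bytes 0xCC 0x9F decode to U+031F, a combining character, so a culture-aware search may treat it as attached to the preceding \n. The class name OrdinalSearchSketch and the helper FindBlankLine are made up for this illustration and are not part of the fiddle.

```csharp
using System;
using System.Text;

public static class OrdinalSearchSketch
{
    // Hypothetical helper (not in the original fiddle): decodes the bytes as
    // UTF-8 and searches for the blank line char by char, ignoring culture rules.
    public static int FindBlankLine(byte[] bytes)
    {
        string text = Encoding.UTF8.GetString(bytes);
        return text.IndexOf("\r\n\r\n", StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Same byte arrays as x and x2 in the fiddle above.
        var x = new byte[] { 99, 108, 111, 115, 101, 13, 10, 13, 10, 204, 159, 67, 4 };
        var x2 = new byte[] { 13, 10, 13, 10, 204, 159 };

        Console.WriteLine("ordinal index = " + FindBlankLine(x));
        Console.WriteLine("ordinal index short = " + FindBlankLine(x2));
    }
}
```

Because an ordinal search compares UTF-16 code units directly, it should report index 5 for the first array and 0 for the short one on every runtime. If that holds, the difference between compilers would come from the culture-sensitive default rather than from the decoding step.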

Upvotes: 0

Views: 66

Answers (0)
