Reputation:
Is there a way to remove the array bounds check in C#
?
here is what I want to achieve:
public static int F(int[] M, int i)
{
return M[i]; // I can guarantee that [i] will never be outside of [0, M.Length]
}
Before this function call I have a logic which already does check for the bounds (with some extra logic into it). The thing I want to remove are following lines:
Program.F(Int32[], Int32)
L0000: sub rsp, 0x28
L0004: cmp edx, [rcx+8] ; I don't need this line
L0007: jae short L0015 ; I don't need this line
L0009: movsxd rax, edx
L000c: mov eax, [rcx+rax*4+0x10]
L0010: add rsp, 0x28
L0014: ret
L0015: call 0x00007ffc8877bc70 ; I don't need this line
L001a: int3 ; I don't need this line
Is there a way of removing those instructions?
public static int G(int[] M, int i)
{
if (i >= 0 && i < M.Length)
return M[i];
return -1;
}
this generates:
Program.G(Int32[], Int32)
L0000: sub rsp, 0x28
L0004: test edx, edx
L0006: jl short L001f
L0008: mov eax, [rcx+8]
L000b: cmp eax, edx
L000d: jle short L001f
L000f: cmp edx, eax
L0011: jae short L0029
L0013: movsxd rax, edx
L0016: mov eax, [rcx+rax*4+0x10]
L001a: add rsp, 0x28
L001e: ret
L001f: mov eax, 0xffffffff
L0024: add rsp, 0x28
L0028: ret
L0029: call 0x00007ffc8877bc70
L002e: int3
as you can see it didn't help.
unsafe
:public static unsafe int H(int* M, int i)
{
return M[i];
}
this generates what I was looking for:
Program.H(Int32*, Int32)
L0000: movsxd rax, edx
L0003: mov eax, [rcx+rax*4]
L0006: ret
But I sadly can't enable unsafe for my project. Is there a solution in "non-unsafe" world?
Upvotes: 10
Views: 4289
Reputation: 6142
Just checked these two patterns on .NET 8.0:
int G1(int[] M, int i)
{
return (uint)i < M.Length ? M[i] : -1;
}
int G2(int[] M, int i)
{
if (i >= 0 && i < M.Length)
return M[i];
return -1;
}
and the codegen doesn't contain bound checks:
; Method Foo:G1(int[],int):int:this (FullOpts)
mov eax, dword ptr [rdx+08H]
cmp eax, r8d
ja SHORT G_M59182_IG05
mov eax, -1
ret
G_M59182_IG05:
mov eax, r8d
mov eax, dword ptr [rdx+4*rax+10H]
ret
; Total bytes of code: 22
; Method Foo:G2(int[],int):int:this (FullOpts)
test r8d, r8d
jl SHORT G_M35821_IG05
mov eax, dword ptr [rdx+08H]
cmp eax, r8d
jle SHORT G_M35821_IG05
mov eax, r8d
mov eax, dword ptr [rdx+4*rax+10H]
ret
G_M35821_IG05:
mov eax, -1
ret
; Total bytes of code: 27
Upvotes: 2
Reputation: 10463
According to this blog:
https://devblogs.microsoft.com/dotnet/performance_improvements_in_net_7/#bounds-check-elimination
PGO feature in .net 7 greatly optimises boundary checks. However I doubt anyone can figure how would it work in your case in theory because I think its non-deterministic.
I would set the following params and then benchmark with and without:
DOTNET_TieredPGO=1
DOTNET_ReadyToRun=0
Upvotes: 3
Reputation: 4498
Actually there's the way. Stumbled upon it in csFastFloat repository.
Idea here is to use MemoryMarshall.GetArrayDataReference to get reference to first item in array and then add shift to get actual value:
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static T FastAccessValue<T>(T[] ar, int index)
{
ref T tableRef = ref MemoryMarshal.GetArrayDataReference(ar);
return Unsafe.Add(ref tableRef, (nint)index);
}
which is safe(?) equivalent of unsafe version
[MethodImpl(MethodImplOptions.AggressiveInlining)]
static unsafe T FastAccessValueUnsafe<T>(T[] ar, int index) where T : unmanaged
{
fixed(T* ptr = ar)
{
return ptr[index];
}
}
without limitation to only unmanaged
structs.
With unsafe access it's even performs 10% faster on large data (over million items)
public int SumUnsafe(int[] ints, int length)
{
int sum = 0;
for (int i = 0; i < length; i++)
{
sum += FastAccessValue(ints, i);
}
return sum;
}
public int SumDirect(int[] ints, int length)
{
int sum = 0;
for (int i = 0; i < ints.Length; i++)
{
sum += ints[i];
}
return sum;
}
Method | ints | length | Mean | Error | StdDev | Code Size |
---|---|---|---|---|---|---|
SumDirect | Int32[100000] | 100000 | 80.13 μs | 0.748 μs | 0.700 μs | 29 B |
SumUnsafe | Int32[100000] | 100000 | 81.99 μs | 0.535 μs | 0.446 μs | 33 B |
SumDirect | Int32[1000000] | 1000000 | 854.73 μs | 5.216 μs | 4.624 μs | 29 B |
SumUnsafe | Int32[1000000] | 1000000 | 795.10 μs | 2.680 μs | 2.238 μs | 33 B |
SumDirect | Int32[10000000] | 10000000 | 10,104.72 μs | 27.199 μs | 22.712 μs | 29 B |
SumUnsafe | Int32[10000000] | 10000000 | 9,126.06 μs | 30.329 μs | 26.886 μs | 33 B |
Benchmark located in this gist
Upvotes: 9