Viet NT

Reputation: 305

Type-abbreviation with compile-time check with zero overhead

Is it possible, and does my little hack work?

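open System.Runtime.InteropServices

// Single-field struct intended as a zero-overhead wrapper around int64
[<StructLayout(LayoutKind.Explicit, Size = 8)>]
type Slice =
    struct
        [<FieldOffset(0)>] val mutable Address: int64
    end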

let incSlice (slice: Slice) = slice.Address + 1L

let incInt64 (address: int64) = address + 1L

Are incSlice and incInt64 compiled to the same assembly code?

The timings are nearly identical, but I'm not 100% sure.

Upvotes: 3

Views: 145

Answers (3)

Ganesh Sittampalam

Reputation: 29100

The trouble with trivial benchmarks is that the optimiser always causes confusion. However, I tried a simple test by building an exe in Release mode and looking at the disassembly in the debugger.

It looks like Slice does still have some more address calculation work than the int64 version:

        let v1 = incSlice sl1
00000066  lea         edi,[ebp-1Ch] 
00000069  lea         esi,[ebp-14h] 
0000006c  movq        xmm0,mmword ptr [esi] 
00000070  movq        mmword ptr [edi],xmm0 
00000074  mov         eax,dword ptr [ebp-1Ch] 
00000077  mov         edx,dword ptr [ebp-18h] 
0000007a  mov         ecx,1 
0000007f  xor         ebx,ebx 
00000081  add         eax,ecx 
00000083  adc         edx,ebx 
00000085  mov         dword ptr [ebp-24h],eax 
00000088  mov         dword ptr [ebp-20h],edx 

        let v2 = incInt64 n
0000008b  mov         eax,dword ptr [ebp+8] 
0000008e  mov         edx,dword ptr [ebp+0Ch] 
00000091  mov         ecx,1 
00000096  xor         ebx,ebx 
00000098  add         eax,ecx 
0000009a  adc         edx,ebx 
0000009c  mov         dword ptr [ebp-2Ch],eax 
0000009f  mov         dword ptr [ebp-28h],edx 

In another test I worked directly with a literal value, and found that the int64 version got optimised down to the result whereas the Slice version didn't, i.e. the existence of the struct seems to defeat the optimiser in more cases.

EDIT: Note this is 32-bit. 64-bit generally looks nicer as per Viet NT's answer.

Also, it turns out this only happens if you run the program under the debugger to begin with. If you attach the debugger after the methods have run, the code looks very similar, and the variant with literal values gets constant-folded for both the int64 and Slice cases.
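A minimal sketch of the "attach afterwards" approach described above, assuming a Release build run as a console program. The `NoInlining` attributes and the pause on `Console.ReadLine` are my own additions for illustration and were not part of the original test. The idea is that the first calls make the JIT compile both functions before any debugger is attached, so the disassembly viewed afterwards should reflect the optimised code rather than the code generated when launching under a debugger.

```fsharp
open System
open System.Diagnostics
open System.Runtime.CompilerServices
open System.Runtime.InteropServices

[<StructLayout(LayoutKind.Explicit, Size = 8)>]
type Slice =
    struct
        [<FieldOffset(0)>] val mutable Address: int64
    end

// NoInlining (an assumption, not in the original) keeps each function
// as a separate method so its disassembly is easy to find.
[<MethodImpl(MethodImplOptions.NoInlining)>]
let incSlice (slice: Slice) = slice.Address + 1L

[<MethodImpl(MethodImplOptions.NoInlining)>]
let incInt64 (address: int64) = address + 1L

let mutable s = Slice()
s.Address <- 41L

// First calls: both methods get JIT-compiled with no debugger attached.
printfn "%d %d" (incSlice s) (incInt64 41L)

printfn "Attach the debugger now, then press Enter"
Console.ReadLine() |> ignore

// Only break if a debugger really was attached during the pause.
if Debugger.IsAttached then Debugger.Break()

// Step into these calls to inspect the already-compiled code.
printfn "%d %d" (incSlice s) (incInt64 41L)
```

Whether the two bodies end up identical still depends on the runtime version and on 32-bit versus 64-bit, so this only sets up the comparison; it does not guarantee the outcome.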

Upvotes: 3

John Palmer

Reputation: 25516

The problem here is trying to add compile-time safety to a function without impacting the run-time performance. While a struct with only a single field may work, its effect on the generated code can be hard to predict.

Instead, a better solution is units of measure.

You can define a measure type with:

[<Measure>] type Address

Then the function becomes:

let slice (t: int64<Address>) = t + 1L<Address>

The compiler will completely optimise away the measure type, while providing type safety.
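As a rough sketch of what that type safety looks like in practice: the `Length` measure and the helper names `incAddress`, `ofRaw` and `toRaw` below are hypothetical, introduced only for illustration. `LanguagePrimitives.Int64WithMeasure` is the standard library function for tagging a raw `int64`, and the `int64` conversion function strips the measure again.

```fsharp
[<Measure>] type Address
[<Measure>] type Length   // hypothetical second measure, for contrast only

// Only accepts values tagged as Address.
let incAddress (a: int64<Address>) : int64<Address> = a + 1L<Address>

// Tag a raw int64 (e.g. one obtained from interop) with the measure.
let ofRaw (raw: int64) : int64<Address> =
    LanguagePrimitives.Int64WithMeasure<Address> raw

// Strip the measure again when a plain int64 is needed.
let toRaw (a: int64<Address>) : int64 = int64 a

let a = ofRaw 41L
let b = incAddress a
printfn "%d" (toRaw b)   // prints 42

// Both of these are rejected at compile time:
// incAddress 41L            // plain int64 is not int64<Address>
// incAddress 41L<Length>    // wrong measure
```

Measures are erased during compilation, so `incAddress` is compiled as a function over plain `int64`.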

Upvotes: 4

Viet NT

Reputation: 305

They seem nearly identical now (on 64-bit):

let test1 (s: Slice) =
    printfn "%d" 10
    let x = s.Address + 1L
    printfn "%d" x

produces:

000000c1  mov         rax,qword ptr [rsp+000000A0h] 
000000c9  inc         rax 
000000cc  mov         qword ptr [rsp+28h],rax 

which seems optimized.

And with the help of inline:

let inline incSlice (s: Slice) = s.Address + 1L

let test2 (s: Slice) =
    printfn "%d" 10
    let x = incSlice s
    printfn "%d" x

produces:

000000be  mov         rax,qword ptr [rsp+000000A0h] 
000000c6  mov         qword ptr [rsp+30h],rax 
000000cb  mov         rax,qword ptr [rsp+30h] 
000000d0  inc         rax 
000000d3  mov         qword ptr [rsp+28h],rax 

This is nearly identical to test1.

Funny assembly code :D, with rax being moved around. That is the reason I don't think the disassembly shown in the debugger is fully optimized.

Upvotes: 1
