Thomas

Reputation: 12087

How to optimize this code for speed in F#, and why is part of it executed twice?

The code is used to pack historical financial data into 16 bytes:

type PackedCandle =
    struct
        val H: single
        val L: single
        val C: single
        val V: int
    end
    new(h: single, l: single, c: single, v: int) = { H = h; L = l; C = c; V = v }
    member this.ToByteArray =
        let a = Array.create 16 (byte 0)
        let h = BitConverter.GetBytes(this.H)
        let l = BitConverter.GetBytes(this.L)
        let c = BitConverter.GetBytes(this.C)
        let v = BitConverter.GetBytes(this.V)
        a.[00] <- h.[0]; a.[01] <- h.[1]; a.[02] <- h.[2]; a.[03] <- h.[3]
        a.[04] <- l.[0]; a.[05] <- l.[1]; a.[06] <- l.[2]; a.[07] <- l.[3]
        a.[08] <- c.[0]; a.[09] <- c.[1]; a.[10] <- c.[2]; a.[11] <- c.[3]
        a.[12] <- v.[0]; a.[13] <- v.[1]; a.[14] <- v.[2]; a.[15] <- v.[3]
        printfn "!!" // for the second part of the question
        a

Arrays of these are sent across the network, so I need the data to be as small as possible. Since this tracks about 80 tradable instruments at the same time, performance matters as well. A tradeoff was made where clients do not get historical data followed by updates; they just get chunks of the last 3 days, minute by minute, so the same data is sent over and over to simplify the client logic. I inherited the problem of making this inefficient design as efficient as possible. It is also done over REST polling, which I'm converting to sockets right now to keep everything binary.

So my first question is: how can I make this faster? In C, where you can cast anything into anything, I could just take a float and write it straight into the array, and nothing would be faster. In F# it looks like I need to jump through hoops: getting the bytes and then copying them one by one instead of 4 at a time, etc. Is there a better way?

My second question: since this was meant to be evaluated once, I made ToByteArray a property. I'm doing some tests with random values in a Jupyter Notebook, and I see that:


the property seems to be executed twice (indicated by the two "!!" lines). Why is that?

Upvotes: 3

Views: 154

Answers (2)

JL0PD

Reputation: 4488

Assuming you already have an array to write to (generally you should use a buffer for reading and writing when working with sockets), you can use System.Runtime.CompilerServices.Unsafe.As<TFrom, TTo> to reinterpret memory as another type (the same thing you can do in C/C++):

type PackedCandle =
    // omitting fields & constructor
    override c.ToString() = $"%f{c.H} %f{c.L} %f{c.C} %d{c.V}" // debug purpose

    static member ReadFrom(array: byte[], offset) =
        // get managed(!) pointer
        // cast pointer to another type
        // same as *(PackedCandle*)(&array[offset]) but safe from GC
        Unsafe.As<byte, PackedCandle> &array.[offset]

    member c.WriteTo(array: byte[], offset: int) =
        Unsafe.As<byte, PackedCandle> &array.[offset] <- c

Usage

let byteArray = Array.zeroCreate<byte> 100 // assume the array comes from a different function

// writing
let mutable offset = 0
for i = 0 to 5 do
    let candle = PackedCandle(float32 i, float32 i, float32 i, i)
    candle.WriteTo(byteArray, offset)
    offset <- offset + Unsafe.SizeOf<PackedCandle>() // "increment pointer"

// reading
offset <- 0 // rewind to the start of the buffer (offset was declared above)
for i = 0 to 5 do
    let candle = PackedCandle.ReadFrom(byteArray, offset)
    printfn "%O" candle
    offset <- offset + Unsafe.SizeOf<PackedCandle>()

But do you really want to mess with pointers (even managed ones)? Have you measured that this code is the bottleneck?

Update

It's better to use MemoryMarshal than raw Unsafe, because MemoryMarshal checks for out-of-range access and enforces at runtime that the type is unmanaged:

member c.WriteTo (array: byte[], offset: int) =
    MemoryMarshal.Write(array.AsSpan(offset), &Unsafe.AsRef(&c))

static member ReadFrom (array: byte[], offset: int) =
    MemoryMarshal.Read<PackedCandle>(ReadOnlySpan(array).Slice(offset))
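Since the question sends whole arrays of candles, the same idea can be applied to the entire array at once. The following is a minimal, self-contained sketch rather than a drop-in implementation: the helper names toBytes and fromBytes are hypothetical, the struct is redeclared with an explicit sequential layout, and the bytes are in the machine's native byte order, so it assumes both ends of the connection agree on endianness.

```fsharp
open System
open System.Runtime.InteropServices

// Same fields as in the question; sequential layout is stated explicitly
// so the in-memory order is H, L, C, V (4 bytes each, 16 bytes total).
[<Struct; StructLayout(LayoutKind.Sequential)>]
type PackedCandle =
    val H: single
    val L: single
    val C: single
    val V: int
    new(h: single, l: single, c: single, v: int) = { H = h; L = l; C = c; V = v }

/// Hypothetical helper: reinterpret the candle array as bytes and copy it in one block.
let toBytes (candles: PackedCandle[]) : byte[] =
    MemoryMarshal.AsBytes(ReadOnlySpan<PackedCandle>(candles)).ToArray()

/// Hypothetical helper: reinterpret a byte buffer as candles.
/// Trailing bytes that do not fill a whole candle are ignored by Cast.
let fromBytes (bytes: byte[]) : PackedCandle[] =
    MemoryMarshal.Cast<byte, PackedCandle>(ReadOnlySpan<byte>(bytes)).ToArray()
```

Both helpers copy once via ToArray; if the socket API accepts a span, the copy in toBytes can be skipped by passing the result of AsBytes directly.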

Upvotes: 5

Tomas Petricek

Reputation: 243041

My first question would be, why do you need the ToByteArray operation? In the comments, you say that you are sending arrays of these values over network, so I assume you plan to convert the data to a byte array so that you can write it to network stream.

I think it would be more efficient (and easier) to instead have a method that takes a StreamWriter and writes the data to the stream directly:

type PackedCandle =
  struct
      val H: single
      val L: single
      val C: single
      val V: int
  end
  new(h: single, l: single, c: single, v: int) = { H = h; L = l; C = c; V = v }
  member this.WriteTo(sw:StreamWriter) =
      sw.Write(this.H)
      sw.Write(this.L)
      sw.Write(this.C)
      sw.Write(this.V)

If you already have some code for the network communication, it will expose a stream that you need to write to. Assuming this is called stream, you can simply do:

use writer = new StreamWriter(stream)
for a in packedCandles do a.WriteTo(writer)
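One caveat: StreamWriter is a text writer, so sw.Write(this.H) emits the number formatted as characters rather than its 4 raw bytes. If the 16-byte binary layout from the question is required, the same shape works with BinaryWriter, which writes each single and int as 4 little-endian bytes. A minimal sketch under that assumption (the writeAll helper is hypothetical):

```fsharp
open System.IO
open System.Text

[<Struct>]
type PackedCandle =
    val H: single
    val L: single
    val C: single
    val V: int
    new(h: single, l: single, c: single, v: int) = { H = h; L = l; C = c; V = v }
    // Writes exactly 16 bytes: three 4-byte floats followed by one 4-byte int.
    member this.WriteTo(bw: BinaryWriter) =
        bw.Write(this.H)
        bw.Write(this.L)
        bw.Write(this.C)
        bw.Write(this.V)

/// Hypothetical helper: write every candle to the stream.
let writeAll (stream: Stream) (candles: seq<PackedCandle>) =
    // leaveOpen = true, so disposing the writer flushes it
    // but leaves the underlying stream to the caller.
    use bw = new BinaryWriter(stream, Encoding.UTF8, true)
    for c in candles do c.WriteTo(bw)
```

The encoding argument only affects strings and chars; it is passed here because the constructor overload that takes leaveOpen requires it.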

Regarding your second question, I think this cannot be answered without a more complete code sample.
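What can be said in general is that an F# property written as member this.X = ... has no backing field, so its body runs every time something reads it. Whether the notebook reads the property a second time while rendering the value is only a guess, not something the question confirms. A minimal sketch of the general behaviour, using a hypothetical Counter type:

```fsharp
type Counter() =
    let mutable runs = 0
    /// Property: no parentheses, and the body re-runs on every read.
    member this.AsProperty =
        runs <- runs + 1
        [| byte runs |]
    /// Method: the body runs only when explicitly called with ().
    member this.AsMethod() =
        runs <- runs + 1
        [| byte runs |]
    /// How many times either body has executed so far.
    member this.Runs = runs

let c = Counter()
let first = c.AsProperty   // body runs once
let second = c.AsProperty  // body runs again
printfn "%d" c.Runs        // prints 2
```

Making ToByteArray a method, or computing the array once and storing it, avoids work being repeated by anything that merely inspects the object's properties.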

Upvotes: 4
