Reputation: 4305
I'm working on putting together an algorithm for searching through an array to find the index of a given value. I'm trying to follow a similar technique found in the .NET Core CLR.
Where I am confused is the value for the idx
that is returned by the call to BitOperations.TrailingZeroCount
. When I search for 2
it should return an index value of 1
for the data that I am searching. What I get is a 4
. Do I need to do some conversion from the byte offset back to whatever the actual type is? Is this the most efficient way to write this? There are not many docs on using intrinsics with .NET.
open System
open System.Runtime.Intrinsics.X86
open System.Runtime.Intrinsics
open System.Numerics
#nowarn "9"
#nowarn "20"
[<EntryPoint>]
let main argv =
let searchSpace : Span<int32> = Span [| 1 .. 8 |]
let pSearchSpace = && (searchSpace.GetPinnableReference ())
let search = Sse2.LoadVector128 (pSearchSpace)
let value = 2
let values = Vector128.Create value
let comparison = Sse2.CompareEqual (values, search)
let matches = Sse2.MoveMask (comparison.AsByte())
if matches > 0 then
let idx = BitOperations.TrailingZeroCount matches
printfn "%A" idx // <--- This prints 4 instead of 1
0 // return an integer exit code
Upvotes: 0
Views: 356
Reputation: 365567
Sse2.MoveMask (comparison.AsByte())
(i.e. vpmovmskb
) gives you 4 mask bits per int32 match result, one from each Byte like you asked for.
Either use vmovmskps
(perhaps MoveMask(cmp.AsFloat())
, if that's how F# does it?) or deal with the byte instead of element indices you get from bsf
/ tzcnt
.
Upvotes: 2