jaanq
jaanq

Reputation: 55

c# - fastest method get integer from part of bits

I have byte[] byteArray, usually byteArray.Length = 1-3

I need decompose an array into bits, take some bits (for example, 5-17), and convert these bits to Int32.

I tried to do this

private static IEnumerable<bool> GetBitsStartingFromLSB(byte b)
{
    for (int i = 0; i < 8; i++)
    {
        yield return (b % 2 != 0);
        b = (byte)(b >> 1);
    }
}
public static Int32 Bits2Int(ref byte[] source, int offset, int length)
{
    List<bool> bools = source.SelectMany(GetBitsStartingFromLSB).ToList();
    bools = bools.GetRange(offset, length);
    bools.AddRange(Enumerable.Repeat(false, 32-length).ToList() );
    int[] array = new int[1];
    (new BitArray(bools.ToArray())).CopyTo(array, 0);
    return array[0];           
}

But this method is too slow, and I have to call it very often.

How can I do this more efficiently?

Thanx a lot! Now i do this:

public static byte[] GetPartOfByteArray(  byte[] source, int offset, int length)
    {
        byte[] retBytes = new byte[length];
        Buffer.BlockCopy(source, offset, retBytes, 0, length);
        return retBytes;
    }
    public static Int32 Bits2Int(byte[] source, int offset, int length)
    {
        if (source.Length > 4)
        {
            source = GetPartOfByteArray(source, offset / 8, (source.Length - offset / 8 > 3 ? 4 : source.Length - offset / 8));
            offset -= 8 * (offset / 8);
        }
        byte[] intBytes = new byte[4];
        source.CopyTo(intBytes, 0);
        Int32 full = BitConverter.ToInt32(intBytes);
        Int32 mask = (1 << length) - 1;
        return (full >> offset) & mask;
    }

And it works very fast!

Upvotes: 4

Views: 615

Answers (2)

quetzalcoatl
quetzalcoatl

Reputation: 33516

First of all, you're asking for optimization. But the only things you've said are:

  • too slow
  • need to call it often

There's no information on:

  • how much slow is too slow? have you measured current code? have you estimated how fast you need it to be?
  • how often is "often"?
  • how large are the source byt arrays?
  • etc.

Optimization can be done in a multitude of ways. When asking for optimization, everything is important. For example, if source byte[] is 1 or 2 bytes long (yeah, may be ridiculous, but you didn't tell us), and if it rarely changes, then you could get very nice results by caching results. And so on.

So, no solutions from me, just a list of possible performance problems:

private static IEnumerable<bool> GetBitsStartingFromLSB(byte b) // A
{
    for (int i = 0; i < 8; i++)
    {
        yield return (b % 2 != 0); // A
        b = (byte)(b >> 1);
    }
}
public static Int32 Bits2Int(ref byte[] source, int offset, int length)
{
    List<bool> bools = source.SelectMany(GetBitsStartingFromLSB).ToList(); //A,B
    bools = bools.GetRange(offset, length); //B
    bools.AddRange(Enumerable.Repeat(false, 32-length).ToList() ); //C
    int[] array = new int[1]; //D
    (new BitArray(bools.ToArray())).CopyTo(array, 0); //D
    return array[0]; //D
}

A: LINQ is fun, but not fast unless done carefully. For each input byte, it takes 1 byte, splits that in 8 bools, passing them around wrapped it in a compiler-generated IEnumerable object *). Note that it all needs to be cleaned up later, too. Probably you'd get a better performance simply returning a new bool[8] or even BitArray(size=8).

*) conceptually. In fact yield-return is lazy, so it's not 8valueobj+1refobj, but just 1 enumerable that generates items. But then, you're doing .ToList() in (B), so me writing this in that way isn't that far from truth.

A2: the 8 is hardcoded. Once you drop that pretty IEnumerable and change it to a constant-sized array-like thing, you can preallocate that array and pass it via parameter to GetBitsStartingFromLSB to further reduce the amount of temporary objects created and later thrown away. And since SelectMany visits items one-by-one without ever going back, that preallocated array can be reused.

B: Converts whole Source array to stream of bytes, converts it to List. Then discards that whole list except for a small offset-length range of that list. Why covert to list at all? It's just another pack of objects wasted, and internal data is copied as well, since bool is a valuetype. You could have taken the range directly from IEnumerable by .Skip(X).Take(Y)

C: padding a list of bools to have 32 items. AddRange/Repeat is fun, but Repeat has to return an IEnumerable. It's again another object that is created and throw away. You're padding the list with false. Drop the list idea, make it an bool[32]. Or BitArray(32). They start with false automatically. That's the default value of a bool. Iterate over the those bits from 'range' A+B and write them into that array by index. Those written will have their value, those unwritten will stay false. Job done, no objects wasted.

C2: connect preallocating 32-item array with A+A2. GetBitsStartingFromLSB doesn't need to return anything, it may get a buffer to be filled via parameter. And that buffer doesn't need to be 8-item buffer. You may pass the whole 32-item final array, and pass an offset so that function knows exactly where to write. Even less objects wasted.

D: finally, all that work to return selected bits as an integer. new temporary array is created&wasted, new BitArray is effectively created&wasted too. Note that earlier you were already doing manual bit-shift conversion int->bits in GetBitsStartingFromLSB, why not just create a similar method that will do some shifts and do bits->int instead? If you knew the order of the bits, now you know them as well. No need for array&BitArray, some code wiggling, and you save on that allocations and data copying again.

I have no idea how much time/space/etc will that save for you, but that's just a few points that stand out at first glance, without modifying your original idea for the code too much, without doing-it-all via math&bitshifts in one go, etc. I've seen MarcGravell already wrote you some hints too. If you have time to spare, I suggest you try first mine, one by one, and see how (and if at all !) each change affects performance. Just to see. Then you'll probably scrap it all and try again new "do-it-all via math&bitshifts in one go" version with Marc's hints.

Upvotes: 1

Marc Gravell
Marc Gravell

Reputation: 1062780

If you're after "fast", then ultimately you need to do this with bit logic, not LINQ etc. I'm not going to write actual code, but you'd need to:

  • use your offset with / 8 and % 8 to find the starting byte and the bit-offset inside that byte
  • compose however many bytes you need - quite possibly up to 5 if you are after a 32-bit number (because of the possibility of an offset) ; for example into a long, in whichever endianness (presumably big-endian?) you expect
  • use right-shift (>>) on the composed value to drop however-many bits you need to apply the bit-offset (i.e. value >>= offset % 8;)
  • mask out any bits you don't want; for example value &= ~(-1L << length); (the -1 gives you all-ones; the << length creates length zeros at the right hand edge, and the ~ swaps all zeros for ones and ones for zeros, so you now have length ones at the right hand edge)
  • if the value is signed, you'll need to think about how you want negatives to be handled, especially if you aren't always reading 32 bits

Upvotes: 8

Related Questions