Reputation: 8610
Yesterday,in an Interview I have been asked to test the 5th bit in a number(to test whether its on and off)and below is how I done it.
int number = 16;
int mask = 1<<5;
if ((number & mask) == 0)
printf("Bit is off");
else
printf("its on");
but he then asked me to optimize this code and do it without using this particular mask.
So my questions what else mask I could have used in this code?
Upvotes: 6
Views: 1121
Reputation: 257
In one of the interviews, i gave the following answer, and he was satisfied, but a little change to the question was 'check if nth bit is set.
int N = 16;
printf ("%d\n", (N >> (n-1)) % 2);
So, when making the answer generic, Not exactly sure which one (of below) is faster for this example.
1<<(n-1) & N (or)
N>>(n-1) % 2 (or)
N>>(n-1) & 1
Upvotes: 0
Reputation: 4290
Maybe the interviewer wanted to see how you reacted to a simple challenge. Or simply wanted to know if you really understood C, and would stand your ground? Maybe the interviewer wanted to know if you knew non-zero is true, and hence test your depth of understanding of C? Or maybe whether you could do binary to hex in your head?
IMHO interviews are about a lot more than code. They are expensive to do. I always tried to get a clear impression of the person, something hard to do by written communication, or even on the phone. After all, some of those folks are going to work with the recruit!
The most compact, and possibly the quickest is probably:
int number = 16; // is this really what they gave?
printf((number & 0x20)?"its on":"Bit is off"); // did they mean 5th or bit 5?
Edit:
I've coded up the original approach, and my alternative, and compiled it for ARM Coretx-M3/4 (this is the processor I am writing for at the moment). It was compiled with -O3. Then I have disassembled each compiled file (using objdump) to get the assembler. I did it this way because the output of gcc -S
was huge; that includes a lot of extra information for the assembler and linker, which made it harder to find the code.
I replaced printf with a dummy_printf to avoid #including stdio.h which added more noise. The dummy_printf isn't exactly the same as printf, it just takes one parameter, but it keeps the output short :-)
The source (all 7 files appended to make it easier to read) are at: http://pastebin.com/PTeApu9n
The resulting concatenated output of objdump (for each .o) is at: http://pastebin.com/kHAmakE3
As you can see, the original is compiled to:
void original_bit5(int number) {
int mask = 1<<5;
if ((number & mask) == 0)
0: f010 0f20 tst.w r0, #32
4: d005 beq.n 1a <original_bit5+0x1a>
dummy_printf("Bit is off");
else
dummy_printf("its on");
6: f240 0000 movw r0, #0
a: f2c0 0000 movt r0, #0
e: f7ff bffe b.w 0 <dummy_printf>
void original_bit5(int number) {
int mask = 1<<5;
if ((number & mask) == 0)
dummy_printf("Bit is off");
12: f240 0000 movw r0, #0
16: f2c0 0000 movt r0, #0
1a: f7ff bffe b.w 0 <dummy_printf>
1e: bf00 nop
I think the call to dummy_printf is using tail-call chaining, i.e. dummy_printf will not return to this function. Very efficient!
There is no function entry code because the first four function parameters are passed in registers r0-r3.
You can't see the addresses of the two strings being loaded in r0. That is because this hasn't been linked.
You can see that:
int mask = 1<<5;
if ((number & mask) == 0)
is compiled to:
0: f010 0f20 tst.w r0, #32
4: d005 beq.n 1a <original_bit5+0x1a>
So 1<<5
and (... == 0)
, are compiler to a more direct and efficient sequence of instructions. There is a branch to the appropriate call of dummy_printf.
My code compiles to:
void my_bit5(int number) {
dummy_printf((number & 0x20)?"its on":"Bit is off");
0: f240 0200 movw r2, #0
4: f240 0300 movw r3, #0
8: f010 0f20 tst.w r0, #32
c: f2c0 0200 movt r2, #0
10: f2c0 0300 movt r3, #0
14: bf14 ite ne
16: 4610 movne r0, r2
18: 4618 moveq r0, r3
1a: f7ff bffe b.w 0 <dummy_printf>
1e: bf00 nop
This also seems to get tail-call optimised, i.e. there is no return from this function because there is no need of one, the return by dummy_printf will return directly to main()
What you can't see is the two registers, r2 and r2 will contain the addresses of the two strings. That is because this hasn't been linked.
As you can see there is a conditional execution instruction 'ite' which loads the parameter register r0 with either register r2 or register r3. So there is no branch in this code.
For a simple processor with pipelining, this can be quite efficient. On a simple pipelined processor, a branch can cause a 'pipeline 'stall' while parts of the pipeline are cleared out. This varies from processor to processor. So I assume gcc has got it right, and generated a better code sequence than executing a branch. I haven't checked.
Spurred on by Lundin, I have come up with this:
void union_bit5(int number) {
union { int n; struct { unsigned :5; unsigned bit :1; }; } tester;
tester.n = number;
if (tester.bit)
dummy_printf("Bit is on");
else
dummy_printf("its off");
}
It does not explicitly include a mask, or bit shifting. It is almost certainly compiler dependent, you'd have to test it to ensure it works (glerk !-(
gcc for ARM generates the same code (bne vs beq, but that can be adjusted) as the OP's solution, so no optimisation, but it removes the mask:
void union_bit5(int number) {
union { int n; struct { unsigned :5; unsigned bit :1; }; } tester;
tester.n = number;
if (tester.bit)
0: f010 0f20 tst.w r0, #32
4: d105 bne.n 1a <union_bit5+0x1a>
dummy_printf("Bit is on");
else
dummy_printf("its off");
6: f240 0000 movw r0, #0
a: f2c0 0000 movt r0, #0
e: f7ff bffe b.w 0 <dummy_printf>
void union_bit5(int number) {
union { int n; struct { unsigned :5; unsigned bit :1; }; } tester;
tester.n = number;
if (tester.bit)
dummy_printf("Bit is on");
12: f240 0000 movw r0, #0
16: f2c0 0000 movt r0, #0
1a: f7ff bffe b.w 0 <dummy_printf>
1e: bf00 nop
For what it's worth:
(number & 0x20)? dummy_printf("its on") : dummy_printf("Bit is off");
gcc for ARM generates exactly the same code as the OP's. It generates branch, and not conditional instructions.
Summary:
...?...:...
operator can compile to code which does not involve branches on the ARM Cortex-M3/4, but may also generate normal branch instructions.I'll add, IMHO the cost of doing a printf is so enormous compared to a bit test, that worrying about optimising a bit test is too small an issue; it fails Amdahl's Law. An appropriate tactic for the bit test is to ensure -O3 or -Os is used.
If you wanted to to do something somewhat perverse (especially for such a trivial problem), but different, which might make the interviewer think, create a lookup table for every byte value. (I don't expect it to be faster ...)
#define BIT5(val) (((val)&0x20)?1:0)
const unsigned char bit5[256] = {
BIT5(0x00),BIT5(0x01), BIT5(0x02),BIT5(0x03),
BIT5(0x04),BIT5(0x05), BIT5(0x06),BIT5(0x07),
// ... you get the idea ...
BIT5(0xF8),BIT5(0xF9), BIT5(0xFA),BIT5(0xFB),
BIT5(0xFC),BIT5(0xFD), BIT5(0xFE),BIT5(0xFF)
};
//...
if (bit5[(unsigned char)number]) {
printf("its on");
} else {
printf("Bit is off");
}
This is a handy technique if there are some complex bit patterns in, for example, a peripheral register, which need converting to a decision, or switch. It is O(1) too
You could combine the two!-)
Upvotes: 6
Reputation: 35
A c learner newbie's try
int number = 16;
if(16 == number&(0x10))
puts("true");
else
puts("false");
Upvotes: 0
Reputation: 6003
Forgive the following answer:
I used to work at a start-up, when the company decided not to pursue a candidate, they come up with a bogus reason to end the interview. Perhaps this was the poster's experience.
asking for kth bit may mean the least significant bit is the zeroth bit so that (number & 1 << 5) would not do. But that was not the issue. He asked for optimization. Sometime the reason you fail an interview has nothing to do with you. In that case it's their loss; there will always be another interview opportunity.
Upvotes: 0
Reputation: 213306
Any attempt to optimize that code falls under the category "premature optimization". If you understand how the compiler translates C to machine code, you would not attempt to optimize that code. I am guessing the interviewer lacked such knowledge.
If we dissect that code, this is what we get:
1<<5
is translated to the literal 32 at compile time. There is absolutely no difference in performance between writing int mask = 1<<5;
and int mask = 32;
, but the latter is far harder to understand.
Further,
if ((number & mask) == 0)
is completely equivalent to if ((number & 32) == 0)
is completely equivalent to if ((number & (1<<5)) == 0)
There exists two cases:
Which of these two that will be picked depends on whether int number = 16;
is modified or not from the point of declaration to the if statement where the masking takes place.
And that's it. Any attempt to write the code differently than in your original example is premature optimization and obfuscation and will not result in any performance difference.
Upvotes: 0
Reputation: 7766
Everybody is shifting to the right. I want to be original and shift to the left:
#define INDEX 5
int number = 16;
if (number<<(sizeof(number)*8-INDEX-1)<0)
printf("Bit #%d is set in %d.\n", INDEX, number);
else
printf("Bit #%d is NOT set in %d.\n", INDEX, number);
This code is ugly and absolutely implementation dependent (the C standard says that the result is undefined). On x86 it works and it is somewhat more efficient because the MSB is always copied in the bit #7 ("sign") of the flags register, which can be tested with a single jns
instruction.
In other words, for INDEX 5, you have:
[...]
shl $0x1F, %eax
test %eax, %eax
jns 8053635
[...]
The original solution of the OP is cleaner, and that's how production code should be like.
Upvotes: 0
Reputation: 796
Are you sure that you should move it 5 bits? How about that:
int n = 16;
printf ("%d\n", (n >> 4) % 2);
Upvotes: 1
Reputation: 1410
One more way is
1: Right shift the number 5 times so that 5th bit becomes 0th from the right(i.e LSB).
2: Now the logic is the numbers with LSB as 1 are odd, and the ones with 0 are even, Check that using %2
If you think the mod operations are much costlier than bit operation, I think it all depends on the compiler. Have a look at this thread
AND faster than integer modulo operation?.
I'm not sure why the interviewer would have asked you to optimize, may be he was expecting the modulus method as answer.
Upvotes: 1
Reputation: 18745
There are two ways to check bit:
if (number & (1 << bit)) { ... }
if ((number >> bit) & 1) { ... }
I think it will be interesting for you: http://graphics.stanford.edu/~seander/bithacks.html
Upvotes: 2
Reputation: 25513
You could use the bit test assember instruction, but it's not impossible that the compiler will pick up on what you're doing and do that anyway.
Apart from that there isn't really anything to optimise, and certainly the only way to see if any of the minor variations on your method that are possible are faster would be to profile.
Here's the code that gcc 4.2.1 -O3 produces for if((number >> 5) & 1))
:
0000000100000ee0 pushq %rbp
0000000100000ee1 movq %rsp,%rbp
0000000100000ee4 shrl $0x05,%edi
0000000100000ee7 notl %edi
0000000100000ee9 andl $0x01,%edi
0000000100000eec movl %edi,%eax
0000000100000eee leave
0000000100000eef ret
and for if(number & (1 << 5))
:
0000000100000ee0 pushq %rbp
0000000100000ee1 movq %rsp,%rbp
0000000100000ee4 shrl $0x05,%edi
0000000100000ee7 notl %edi
0000000100000ee9 andl $0x01,%edi
0000000100000eec movl %edi,%eax
0000000100000eee leave
0000000100000eef ret
So we see that at least gcc 4.2.1 produces identical code in these cases, but it doesn't use the bit test instruction.
Upvotes: 0