Reputation: 2921
To what extent can a JIT replace platform independent code with processor-specific machine instructions?
For example, the x86 instruction set includes the BSWAP instruction to reverse a 32-bit integer's byte order. In Java the Integer.reverseBytes() method is implemented using multiple bitwise masks and shifts, even though in x86 native code it could be implemented in a single instruction using BSWAP. Are JITs (or static compilers for that matter) able to make the change automatically or is it too complex or not worth it due to a poor speed/time tradeoff?
(I know that this is in most cases a micro-optimisation, but I'm interested nonetheless.)
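To make the comparison concrete, here is a minimal sketch of the shift-and-mask approach next to the library call. The class and method names (ByteSwapSketch, reverseBytesPortable) are hypothetical and chosen for illustration; this is not a copy of the JDK source, only one way such a method might be written.

```java
// Hypothetical helper class; the names are illustrative, not taken from the JDK.
final class ByteSwapSketch {
    // Reverses the four bytes of x using only shifts and masks.
    static int reverseBytesPortable(int x) {
        return (x << 24)                 // lowest byte moves to the top
             | ((x & 0x0000FF00) << 8)   // second byte moves up one position
             | ((x >>> 8) & 0x0000FF00)  // third byte moves down one position
             | (x >>> 24);               // highest byte moves to the bottom
    }

    public static void main(String[] args) {
        int value = 0x11223344;
        // Both lines print 44332211.
        System.out.println(Integer.toHexString(reverseBytesPortable(value)));
        System.out.println(Integer.toHexString(Integer.reverseBytes(value)));
    }
}
```

The portable version takes several ALU operations, while a single BSWAP would produce the same result on x86.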
Upvotes: 1
Views: 249
Reputation: 86
For this case, yes: the HotSpot server compiler can do this optimization. The reverseBytes() methods are registered as vmIntrinsics in HotSpot. When the JIT compiler compiles a call to one of these methods, it generates a special IR node instead of compiling the whole method body, and on x86 that node is translated into 'bswap'. See src/share/vm/opto/library_call.cpp:
    //---------------------------- inline_reverseBytes_int/long/char/short-------------------
    // inline Integer.reverseBytes(int)
    // inline Long.reverseBytes(long)
    // inline Character.reverseBytes(char)
    // inline Short.reverseBytes(short)
    bool LibraryCallKit::inline_reverseBytes(vmIntrinsics::ID id) {
      assert(id == vmIntrinsics::_reverseBytes_i || id == vmIntrinsics::_reverseBytes_l ||
             id == vmIntrinsics::_reverseBytes_c || id == vmIntrinsics::_reverseBytes_s,
             "not reverse Bytes");
      if (id == vmIntrinsics::_reverseBytes_i && !Matcher::has_match_rule(Op_ReverseBytesI))  return false;
      if (id == vmIntrinsics::_reverseBytes_l && !Matcher::has_match_rule(Op_ReverseBytesL))  return false;
      if (id == vmIntrinsics::_reverseBytes_c && !Matcher::has_match_rule(Op_ReverseBytesUS)) return false;
      if (id == vmIntrinsics::_reverseBytes_s && !Matcher::has_match_rule(Op_ReverseBytesS))  return false;
      _sp += arg_size(); // restore stack pointer
      switch (id) {
      case vmIntrinsics::_reverseBytes_i:
        push(_gvn.transform(new (C, 2) ReverseBytesINode(0, pop())));
        break;
      case vmIntrinsics::_reverseBytes_l:
        push_pair(_gvn.transform(new (C, 2) ReverseBytesLNode(0, pop_pair())));
        break;
      case vmIntrinsics::_reverseBytes_c:
        push(_gvn.transform(new (C, 2) ReverseBytesUSNode(0, pop())));
        break;
      case vmIntrinsics::_reverseBytes_s:
        push(_gvn.transform(new (C, 2) ReverseBytesSNode(0, pop())));
        break;
      default:
        ;
      }
      return true;
    }
and src/cpu/x86/vm/x86_64.ad
    instruct bytes_reverse_int(rRegI dst) %{
      match(Set dst (ReverseBytesI dst));
      format %{ "bswapl $dst" %}
      opcode(0x0F, 0xC8); /*Opcode 0F /C8 */
      ins_encode( REX_reg(dst), OpcP, opc2_reg(dst) );
      ins_pipe( ialu_reg );
    %}
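If you want to look for the intrinsic yourself, a small harness like the following may help. It is only a sketch: the class and method names (ReverseBytesWarmup, countRoundTripFailures) are hypothetical, and the iteration count is an arbitrary value assumed to be large enough for the loop to become hot. It calls Integer.reverseBytes() repeatedly so the JIT is likely to compile the loop.

```java
// Hypothetical harness; the names and the iteration count are illustrative assumptions.
final class ReverseBytesWarmup {
    // Calls Integer.reverseBytes often enough that a JIT is likely to compile the loop.
    // Returns how many inputs were NOT restored by reversing their bytes twice.
    static int countRoundTripFailures(int iterations) {
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            int swapped = Integer.reverseBytes(i);
            if (Integer.reverseBytes(swapped) != i) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Reversing twice is the identity, so this prints 0.
        System.out.println(countRoundTripFailures(10_000_000));
    }
}
```

On HotSpot builds that support diagnostic options, running this with -XX:+UnlockDiagnosticVMOptions -XX:+PrintIntrinsics should report which intrinsics were used. Inspecting the generated machine code with -XX:+PrintAssembly additionally needs the hsdis disassembler plugin. The exact output depends on the JVM version and CPU.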
Upvotes: 1