pmor

Reputation: 6286

What are the GCC and Clang options to generate Intel DL Boost bfloat16 instructions?

For this code:

#include <stdfloat>

std::bfloat16_t foo(std::float32_t f)
{
    return f;
}

GCC generates this code:

foo(_Float32):
        sub     rsp, 8
        call    __truncsfbf2
        add     rsp, 8
        ret

Here we see call __truncsfbf2, which appears to be a software implementation (libgcc/soft-fp/truncsfbf2.c).
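For reference, here is a minimal sketch of what a software float32-to-bfloat16 conversion typically computes, assuming round-to-nearest-even and NaN quieting. The name float_to_bf16_bits is hypothetical, and this is an illustration only, not the actual libgcc code.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (NOT libgcc's implementation): converts an IEEE-754
// binary32 value to a bfloat16 bit pattern, assuming round-to-nearest-even.
std::uint16_t float_to_bf16_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // read the float's bit pattern safely

    // NaN: keep the sign and upper payload bits, and set a mantissa bit
    // so the truncated result is still a NaN.
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u)
        return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);

    // Round to nearest, ties to even: add 0x7FFF plus the lowest kept bit,
    // then drop the low 16 bits.
    std::uint32_t lsb = (bits >> 16) & 1u;
    bits += 0x7FFFu + lsb;
    return static_cast<std::uint16_t>(bits >> 16);
}
```

For example, 1.0f (0x3F800000) maps to 0x3F80, and the exact tie 1.00390625f (0x3F808000) rounds down to the even value 0x3F80. A hardware conversion instruction would do this work in a single operation instead of a library call.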

It is known that:

Hence, when targeting Cascade Lake, I expect to see an Intel DL Boost bfloat16 instruction, VCVT..., instead of call __truncsfbf2.

I've already tried adding -march=cascadelake, but GCC still generates call __truncsfbf2.

Are there any GCC options to generate Intel DL Boost bfloat16 instructions?

The same question goes for Clang.

Upvotes: 1

Views: 53

Answers (0)
