What is the difference between vfmaq_f32 and vmlaq_f32 in the NEON instruction set, and how do they differ in speed and accuracy?

Question

Hello, what is the difference between vfmaq_f32 and vmlaq_f32 in the NEON instruction set, and how do they differ in running speed and accuracy?

On macOS ARM64, the code below gives consistent results:

#include <arm_neon.h>  // NEON intrinsics (vdupq_n_f32, vfmaq_f32, vmlaq_f32, ...)
#include <cstdio>      // printf
using namespace std;
int main(){
    float a = 12.3839467819;
    float b = 21.437678904;
    float c = 4171.42144;
    printf("%.17f\n",a);
    printf("%.17f\n",b);
    printf("%.17f\n",c);

    // Scalar reference: a + b * c
    printf("%.17f\n",a+b*c);

    // Broadcast each scalar into all four lanes
    float32x4_t a_reg = vdupq_n_f32(a);
    float32x4_t b_reg = vdupq_n_f32(b);
    float32x4_t c_reg = vdupq_n_f32(c);

    // Fused multiply-add intrinsic: a + b * c
    float32x4_t res_reg = vfmaq_f32(a_reg, b_reg, c_reg);
    float res[4] = {0.f};
    vst1q_f32(res,res_reg);
    printf("%.17f\n",res[0]);

    // Multiply-accumulate intrinsic: a + b * c
    res_reg = vmlaq_f32(a_reg, b_reg, c_reg);
    vst1q_f32(res,res_reg);
    printf("%.17f\n",res[0]);

    // Separate multiply, then add
    res_reg = vmulq_f32(b_reg, c_reg);
    res_reg = vaddq_f32(res_reg, a_reg);
    vst1q_f32(res,res_reg);
    printf("%.17f\n",res[0]);
    return 0;
}
