David Given

Reputation: 13701

gcc internals: calculating instruction costs

I'm working on a gcc backend for an architecture that has instructions for indexed array access; so, ld r0, (r1, r2) is equivalent to r0 = r1[r2], where r1 is an int32_t*.

I'm representing this in the .md file with the following pattern:

(define_insn "*si_load_indexed"
  [
    (set
      (match_operand:SI 0 "register_operand" "=r")
      (mem:SI
        (plus:SI
          (mult:SI
            (match_operand:SI 1 "register_operand" "%r")
            (const_int 4))
          (match_operand:SI 2 "register_operand" "r"))))
  ]
  ""
  "ld %0, (%2, %1)"
  [(set_attr "length" "4")]
)

However, the instruction is never actually being emitted. Looking at the debug output from the instruction combining stage, I see this:

Trying 8, 9 -> 10:
Successfully matched this instruction:
(set (reg:SI 47 [ *_5 ])
    (mem:SI (plus:SI (mult:SI (reg/v:SI 43 [ b ])
                (const_int 4 [0x4]))
            (reg:SI 0 r0 [ a ])) [2 *_5+0 S4 A32]))
rejecting combination of insns 8, 9 and 10
original costs 8 + 4 + 4 = 16
replacement cost 32

If I've read this correctly, it indicates that the instruction pattern has been matched, but the instruction has been rejected due to being more expensive than the original instructions.

So, how is it calculating the cost of my instruction? Where's it getting that 32 from (which seems weirdly high)? How do I persuade gcc to actually use this instruction?

Upvotes: 0

Views: 633

Answers (1)

ams

Reputation: 25599

The proper place to ask questions like this is [email protected]. They're very helpful if you ask intelligent questions. :)

You should read the Internals Manual first, of course. The relevant section is here: http://gcc.gnu.org/onlinedocs/gccint/Costs.html#Costs

I believe you need to look at TARGET_RTX_COSTS, but I could be wrong. The default behaviour, I think, is to estimate the cost by stepping through the RTL recursively and adding up the cost of each operation, but it's convoluted and it's been a while since I looked at it (look at rtx_cost in rtlanal.c).
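To make the recursive idea concrete, here is a rough standalone model of it in C. This is not GCC's code: the expr struct, node_cost, expr_cost, indexed_load_cost and combination_rejected are all hypothetical names made up for illustration. The per-node weights are assumptions based on my memory of the defaults (a register is free, a multiply costs 5 instructions, everything else costs 1). Only the COSTS_N_INSNS scaling of 4 units per instruction is taken from GCC itself.

```c
#include <stddef.h>

/* Same scaling GCC uses: one "instruction" is 4 cost units. */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical miniature of an RTL expression tree. */
enum op { OP_REG, OP_CONST_INT, OP_PLUS, OP_MULT, OP_MEM };

struct expr {
    enum op code;
    const struct expr *kid[2];   /* unused slots are NULL */
};

/* Assumed default weights (not read from any real port):
   registers are free, multiply is expensive, anything else is 1 insn. */
static int node_cost(enum op code)
{
    switch (code) {
    case OP_REG:  return 0;
    case OP_MULT: return COSTS_N_INSNS(5);
    default:      return COSTS_N_INSNS(1);
    }
}

/* Cost of a tree = cost of this node + cost of every operand. */
int expr_cost(const struct expr *x)
{
    int total = node_cost(x->code);
    for (size_t i = 0; i < 2; i++)
        if (x->kid[i] != NULL)
            total += expr_cost(x->kid[i]);
    return total;
}

/* The source of the matched pattern:
   (mem (plus (mult (reg b) (const_int 4)) (reg a))) */
int indexed_load_cost(void)
{
    static const struct expr b    = { OP_REG,       { NULL, NULL } };
    static const struct expr a    = { OP_REG,       { NULL, NULL } };
    static const struct expr four = { OP_CONST_INT, { NULL, NULL } };
    static const struct expr mult = { OP_MULT,      { &b, &four } };
    static const struct expr plus = { OP_PLUS,      { &mult, &a } };
    static const struct expr mem  = { OP_MEM,       { &plus, NULL } };
    return expr_cost(&mem);
}

/* Simplified version of the check seen in the combine dump: the
   replacement is dropped when it costs more than what it replaces. */
int combination_rejected(int original_cost, int replacement_cost)
{
    return original_cost > 0 && replacement_cost > original_cost;
}
```

Under these assumed weights the tree sums to 4 (mem) + 4 (plus) + 20 (mult) + 4 (const_int) = 32, which happens to match the replacement cost in your dump, and 32 > 16 gives the rejection. If the model is close to what is really going on, most of the cost comes from the mult node being priced as a real multiply, so a TARGET_RTX_COSTS hook that recognises the scaled index inside an address and reports something cheaper would be the first thing to try.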

Other ports add instruction attributes to help them judge costs.

Upvotes: 2
