Swapnil

Reputation: 1

Rocket core (riscv) timing not meeting

I am trying to synthesize the Rocket core in Design Compiler using the TSMC28HPM library, but timing is not being met.

Targeted frequency: 500 MHz

Without FPU: achievable frequency 400 MHz
With FPU: achievable frequency 200 MHz

Currently my constraints only define the clock. Are there any timing exceptions for the design?

What scenario was assumed/tested to achieve 1 GHz?

Failing paths summary:

Startpoint: RocketTile_1_core/div/divisor_reg_* (rising edge-triggered flip-flop clocked by clk)
Endpoint: RocketTile_1_core/div/remainder_reg_* (rising edge-triggered flip-flop clocked by clk)
(VIOLATED) -0.76

Startpoint: RocketTile_1_core/div/remainder_reg_* (rising edge-triggered flip-flop clocked by clk)
Endpoint: RocketTile_1_core/div/remainder_reg_* (rising edge-triggered flip-flop clocked by clk)
(VIOLATED) -0.76

Startpoint: RocketTile_1_HellaCache_1/s2_store_bypass_reg (rising edge-triggered flip-flop clocked by clk)
Endpoint: RocketTile_1_core/mem_reg_wdata_reg_* (rising edge-triggered flip-flop clocked by clk)
(VIOLATED) -0.60

Startpoint: RocketTile_1_HellaCache_1/d (rising edge-triggered flip-flop clocked by clk)
Endpoint: RocketTile_1_core/mem_reg_wdata_reg_* (rising edge-triggered flip-flop clocked by clk)
(VIOLATED) -0.60

More failing paths to mem_reg_wdata_reg_*

Startpoint: RocketTile_1_core/mem_ctrl_branch_reg (rising edge-triggered flip-flop clocked by clk)
Endpoint: RocketTile_1_dtlb/r_refill_tag_reg_* (rising edge-triggered flip-flop clocked by clk)
(VIOLATED) -0.54

Startpoint: uncore_PRCI_1/time_reg_* (rising edge-triggered flip-flop clocked by clk)
Endpoint: uncore_PRCI_1/time_reg_* (rising edge-triggered flip-flop clocked by clk)
(VIOLATED) -0.52

Startpoint: uncore_outmemsys/l1tol2net/acqNet/arb/T_1236_reg_* (rising edge-triggered flip-flop clocked by clk)
Endpoint: uncore_outmemsys/L2BroadcastHub_1/BufferedBroadcastAcquireTracker_2_1/data_buffer_4_reg_* (rising edge-triggered flip-flop clocked by clk)
(VIOLATED) -0.51

Most of the violations are from T_1236_reg_*
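For reference, the slack figures in the summary can be converted into an equivalent maximum clock frequency. This is a minimal sketch, assuming the paths were timed against the 500 MHz target (a 2.0 ns period), which the report excerpt does not state explicitly. The function name and the endpoint labels are only illustrative.

```python
def achievable_mhz(target_mhz: float, worst_slack_ns: float) -> float:
    """Return the clock frequency at which a path with the given slack would just meet timing."""
    period_ns = 1000.0 / target_mhz              # e.g. 500 MHz -> 2.0 ns
    min_period_ns = period_ns - worst_slack_ns   # negative slack lengthens the required period
    return 1000.0 / min_period_ns


# Worst slack per endpoint group, copied from the failing paths summary.
# ASSUMPTION: all of these were reported against a 500 MHz clock constraint.
slack_ns_by_endpoint = {
    "div/remainder_reg_*": -0.76,
    "core/mem_reg_wdata_reg_*": -0.60,
    "dtlb/r_refill_tag_reg_*": -0.54,
    "PRCI/time_reg_*": -0.52,
    "L2BroadcastHub/data_buffer_4_reg_*": -0.51,
}

for endpoint, slack_ns in slack_ns_by_endpoint.items():
    print(f"{endpoint}: {achievable_mhz(500.0, slack_ns):.1f} MHz")
```

Under that assumption, the worst path (-0.76 ns) corresponds to a required period of 2.76 ns, or roughly 362 MHz.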

Upvotes: 0

Views: 440

Answers (2)

turbo_fingers

Reputation: 11

There are many areas you need to consider for logic synthesis. You're using DC, but are you using a physical flow with a floorplan? A proper physical flow models wire delay far more realistically than wireload models do; without it you might be synthesising something that ends up being unimplementable at P&R.

You make no mention of your clock tree design. What clock uncertainty are you using to emulate the effects of your (eventual) physical clock tree? There will be some skew!
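To make the effect of clock uncertainty concrete, here is a small sketch of how much of the clock period is left for logic once an uncertainty margin is subtracted. The uncertainty values are hypothetical placeholders, not figures from the TSMC28HPM library or from the original post, and the calculation ignores flip-flop setup and clock-to-Q time for simplicity.

```python
def logic_budget_ns(target_mhz: float, uncertainty_ns: float) -> float:
    """Clock period minus the uncertainty margin, i.e. the time left for register-to-register paths."""
    period_ns = 1000.0 / target_mhz
    return period_ns - uncertainty_ns


# ASSUMPTION: illustrative uncertainty values only; pick real ones from your
# expected clock tree skew and jitter.
for uncertainty_ns in (0.0, 0.05, 0.10, 0.20):
    budget = logic_budget_ns(500.0, uncertainty_ns)
    print(f"uncertainty {uncertainty_ns:.2f} ns -> {budget:.2f} ns available at 500 MHz")
```

The point is that any margin you add for skew comes straight out of the 2.0 ns period, so a design that only just closes with zero uncertainty will fail once a realistic value is applied.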

I am not familiar with the TSMC library you are using. Does it offer a variety of VT cells, i.e. low-VT cells for fast logic at the cost of leakage current? In a multi-VT setup you'd expect the number of low-VT cells to increase as the target clock speed increases. You may be using all regular- or high-VT cells, which would limit the achievable design speed.

It's always worth keeping an eye on heavily loaded cells, and on areas of the architecture that bloat in area as the target clock speed increases. Logic bloat (increased area) is a classic sign of struggling timing closure and will lead to problems at P&R.

Are you inserting DFT yet? Bear in mind that if you aren't, DC may already be using scan cells to achieve timing closure, which will lead to DFT problems down the line when you do try to insert scan.

For a more informed answer, a complete timing report would be needed, showing all of the cells between the timing start and end points.

As mentioned in the other answers, pipelining of the datapath will be critical, and memory timing probably will be too.

Caution is always advised with a non-physical synthesis flow; you should always consider a PPA (power, performance, area) study.

Good luck

Upvotes: 0

Chris

Reputation: 3987

Retiming of the FPU is mandatory - it's described combinationally and padded out with a parameterizable number of registers.
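To illustrate why retiming matters here, a toy model: if all of the combinational logic sits in front of the first register and the remaining registers are just padding, the whole delay lands in one stage. Retiming moves registers into the logic to balance the stages. The 6.0 ns delay and the three stages below are hypothetical numbers chosen for the example, not measurements of the Rocket FPU, and the even split is an idealized best case that real retiming only approximates.

```python
def min_period_ns(stage_delays_ns: list[float]) -> float:
    """The clock period is bounded below by the slowest pipeline stage."""
    return max(stage_delays_ns)


def ideal_retime(stage_delays_ns: list[float]) -> list[float]:
    """Idealized retiming: spread the total delay evenly over the same number of stages."""
    total = sum(stage_delays_ns)
    stages = len(stage_delays_ns)
    return [total / stages] * stages


# ASSUMPTION: 6.0 ns of combinational logic followed by two empty padding stages.
unretimed = [6.0, 0.0, 0.0]
retimed = ideal_retime(unretimed)

print("unretimed min period:", min_period_ns(unretimed), "ns")
print("retimed min period:  ", min_period_ns(retimed), "ns")
```

In this toy case the unretimed block limits the clock to the full combinational delay, while the balanced version is bounded by one third of it.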

I also recommend playing with the other parameters to see if you can find a more favorable setup (TLB entries, BTB entries, etc.). Remove ISA extensions like the div unit and FPU since those are showing up in your critical paths. Also be aware that the uncore/L2 should probably be placed in its own clock domain.

However, since Rocket has reached >1.5GHz with full ISA support in IBM 45nm, I'm surprised that you aren't reaching 500 MHz.

Upvotes: 0
