Qazi

Reputation: 355

Verilog strange simulation results post synthesis

I am facing a strange problem. The code is for a simple ALU; only the relevant part is pasted here:

   always @(posedge clk or posedge rst)
   begin
        if (rst == 1) begin
           mul_valid_shr = 3'b000; 
        end else begin
            if (op_mul_i == 1) begin
                mul_valid_shr = 3'b111;
            end else begin
                mul_valid_shr <= mul_valid_shr << 1;
            end
        end
   end

And outside the always block:

assign mul_valid = mul_valid_shr[2];

The post-synthesis functional simulation with my test bench gives the following results:

[Image: post-synthesis simulation waveform]

The reset is already low, so why does the simulation fail the first time but work fine the second and third times? If I trigger op_mul_i before the 100 ns mark, even with rst low, mul_result also fails the first time.

Any guesses are welcome.

UPDATE: FULL CODE HERE: https://www.edaplayground.com/x/28Hx

Upvotes: 2

Views: 1604

Answers (4)

patstew

Reputation: 2001

The Xilinx simulator simulates the FPGA global reset for the first 100ns of any post-synthesis simulation, so you basically have to hold your logic in reset and clock for at least 100ns to get sensible results. This is mentioned in UG900 on pg 13.
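As a rough illustration of holding the design in reset past that window, here is a minimal testbench sketch. The module name `tb_reset_hold`, the 10 ns clock period, and the 120 ns hold time are assumptions for illustration, and the DUT instantiation is left out because the full port list is not shown in the question.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench skeleton; names and timings are assumptions.
module tb_reset_hold;
    reg clk      = 1'b0;
    reg rst      = 1'b1;   // start with reset asserted
    reg op_mul_i = 1'b0;

    // Assumed 10 ns clock period
    always #5 clk = ~clk;

    // The DUT would be instantiated here and connected to clk, rst, op_mul_i.

    initial begin
        // Keep reset asserted, with the clock running, until after the
        // first 100 ns of the post-synthesis simulation.
        #120;
        @(posedge clk);
        rst <= 1'b0;

        // Only start driving stimulus once reset has been released.
        repeat (2) @(posedge clk);
        op_mul_i <= 1'b1;
        @(posedge clk);
        op_mul_i <= 1'b0;

        repeat (5) @(posedge clk);
        $finish;
    end
endmodule
```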

Upvotes: 4

Greg

Reputation: 19122

Verilog has the concepts of nondeterminism and race conditions. Below are excerpts from various versions of the Verilog and SystemVerilog standards explaining the concepts:

  • IEEE Std 1364-1995 § 5.4.2 Nondeterminism
  • IEEE Std 1364-2001 § 5.4.2 Nondeterminism
  • IEEE Std 1800-2012 § 4.7 Nondeterminism

One source of nondeterminism is the fact that active events can be taken off the queue and processed in any order. Another source of nondeterminism is that statements without time-control constructs in behavioral blocks do not have to be executed as one event. Time control statements are the # expression and @ expression constructs (see 9.7 [9.4 for IEEE1800]). At any time while evaluating a behavioral statement, the simulator may suspend execution and place the partially completed event as a pending active event on the event queue. The effect of this is to allow the interleaving of process execution. Note that the order of interleaved execution is nondeterministic and not under control of the user.

  • IEEE Std 1364-1995 § 5.5 Race conditions
  • IEEE Std 1364-2001 § 5.5 Race conditions
  • IEEE Std 1800-2012 § 4.8 Race conditions

Because the execution of expression evaluation and net update events may be intermingled, race conditions are possible:

assign p = q;
initial begin
  q = 1;
  #1 q = 0;
  $display(p);
end

The simulator is correct in displaying either a 1 or a 0. The assignment of 0 to q enables an update event for p. The simulator may either continue and execute the $display task or execute the update for p, followed by the $display task.

In short, this means an always block that triggers on clk can be evaluated before or after op_mul_i is updated, even though clk and op_mul_i change in the same time-step. The nondeterminism and race-condition behaviors are intentional: they give the language a way to mimic what can happen on critical paths in FPGAs and silicon.

Regardless, the solution and best practice is to have an offset (in time or in scheduler region) between the clock and the input stimulus. You can use a time offset, such as ± 1 on the first # delay, as I suggested in my comment. Or assign the input stimulus with non-blocking assignments (<=), which will always be updated after the clock and anything dependent on the clock. (This is why flops should be assigned with non-blocking assignments.) Which route you take is up to you or your team lead to decide.
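Both options can be sketched in a small testbench. This is only an illustration: the module name `tb_stimulus_sketch`, the 10 ns clock period, and the specific delays are assumptions and are not taken from the original test bench.

```verilog
`timescale 1ns / 1ps

// Hypothetical stimulus sketch; names and timings are assumptions.
module tb_stimulus_sketch;
    reg clk      = 1'b0;
    reg op_mul_i = 1'b0;

    // Rising edges at 5, 15, 25, 35, ... ns
    always #5 clk = ~clk;

    initial begin
        // Option 1: time offset. The input changes 1 ns after a rising
        // edge, so it never changes in the same time-step as the clock.
        #16 op_mul_i = 1'b1;   // t = 16 ns
        #10 op_mul_i = 1'b0;   // t = 26 ns

        // Option 2: non-blocking assignment on the clock edge. The update
        // is applied after the blocks triggered by this edge have sampled
        // the old value.
        @(posedge clk);
        op_mul_i <= 1'b1;
        @(posedge clk);
        op_mul_i <= 1'b0;

        #20 $finish;
    end
endmodule
```

With either option, a flop clocked by clk sees a stable op_mul_i at every rising edge, so the result no longer depends on the simulator's event ordering.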

Upvotes: 1

Serge

Reputation: 12384

You created an asynchronous flop with op_mul_i as an asynchronous signal. It is modified in your initial block, and this modification is not synchronized with clk. So it looks like a race to me, and the hardware is correct in ignoring some steps.

So your earlier simulation results were probably correct only because of a simulation artifact. I guess the right RTL approach would be to sync the signal with the clock by adding another flop for this signal.

Other than that, you can try non-blocking assignments or #0 delays for this signal in the initial block of your simulation.
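One way such a synchronizing stage might look is sketched below. The module name `op_mul_sync` and the output name `op_mul_q` are hypothetical, and the sketch uses two flops, a common pattern for signals that may be truly asynchronous, rather than the single extra flop mentioned above.

```verilog
// Hypothetical synchronizer sketch; module and signal names are assumptions.
module op_mul_sync (
    input  wire clk,
    input  wire rst,
    input  wire op_mul_i,   // possibly asynchronous to clk
    output wire op_mul_q    // version of op_mul_i aligned to clk
);
    reg [1:0] sync_ff;

    always @(posedge clk or posedge rst) begin
        if (rst)
            sync_ff <= 2'b00;
        else
            sync_ff <= {sync_ff[0], op_mul_i};  // shift the input through two flops
    end

    assign op_mul_q = sync_ff[1];
endmodule
```

Note that this delays the signal by two clock cycles, so mul_valid would be asserted correspondingly later.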

Upvotes: 0

user8238651

Reputation: 1

How is op_mul_i generated? Is it synchronous to clk? I ask because in the second part of your simulation, I see mul_valid being driven to logic-1 while op_mul_i is logic-1. If it were synchronous, I would expect mul_valid to go to logic-1 at the clock edge following the 200 ns edge. As this is post-synthesis, I suspect metastability is causing this issue. At 100 ns, op_mul_i changes within the failure window, so the clock edge does not capture op_mul_i as logic-1, and hence you don't see anything.

Synchronize op_mul_i to clk, and use the synchronized signal to drive mul_valid_shr. Also, don't use blocking assignments in a sequential block.
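A minimal sketch of the shift register from the question with non-blocking assignments throughout might look like this. The module wrapper `mul_valid_gen` and the input name `op_mul_sync` (standing for an already synchronized version of op_mul_i) are assumptions; the register and output names are taken from the question.

```verilog
// Hypothetical wrapper; only mul_valid_shr and mul_valid come from the question.
module mul_valid_gen (
    input  wire clk,
    input  wire rst,
    input  wire op_mul_sync,   // assumed: op_mul_i already synchronized to clk
    output wire mul_valid
);
    reg [2:0] mul_valid_shr;

    // Every assignment in the clocked block is non-blocking.
    always @(posedge clk or posedge rst) begin
        if (rst)
            mul_valid_shr <= 3'b000;
        else if (op_mul_sync)
            mul_valid_shr <= 3'b111;
        else
            mul_valid_shr <= mul_valid_shr << 1;
    end

    assign mul_valid = mul_valid_shr[2];
endmodule
```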

Hope that helps. VK

Upvotes: 0
