OptimusPrime

Reputation: 143

Truncating signed fixed-point conversion from Q2.28 to Q2.14 in Verilog

I have a Verilog module whose wire output is a signed Q2.28 fixed-point value: 2 bits for the integer part and 28 bits for the fractional part, so it can represent numbers in the range [-2, 2). I have a 16-bit bus to carry these results, so I want 2 bits for the integer part and 14 bits for the fractional part (Q2.14) to keep the same range as the original data. What are the best practices for rounding or truncating signed Q2.28 fixed-point numbers to signed Q2.14? Simply chopping off the 14 LSBs might bias the mean of the data.

Upvotes: 1

Views: 711

Answers (1)

nguthrie

Reputation: 2685

It depends on how you want to do the rounding. The easiest approach is to round to the nearest representable value (here, the nearest Q2.14 step), but when the dropped bits are exactly half an LSB the tie has to go one way, and always breaking ties in the same direction slightly shifts your mean (there is a very long discussion of this at https://en.wikipedia.org/wiki/Rounding).

Assuming you are OK with that, the implementation is as simple as adding the first bit you are dropping to the bits you keep, but you will have to deal with the possibility of overflow.

module round (input signed [29:0] full_res_in, output logic signed [15:0] rounded_out);
    logic signed [16:0] rounded;
    always_comb begin
        // Drop the 14 LSBs with an arithmetic shift, then add the first
        // dropped bit (worth half an output LSB) to round to the nearest
        // Q2.14 value. A bit-select is unsigned, so it is cast to a signed
        // 2-bit value; otherwise the whole expression is evaluated as
        // unsigned and >>> no longer sign-extends.
        rounded = (full_res_in >>> 14) + $signed({1'b0, full_res_in[13]});

        // If overflow occurs clip at the max positive value
        // No need to check the other direction since we are always
        // adding a non-negative value
        if (rounded == 17'h08000) rounded_out = 16'h7FFF;
        else                      rounded_out = 16'(rounded);
    end
endmodule
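
If the tie bias mentioned above matters for your data, one common alternative is round-half-to-even (convergent rounding), which sends exact ties to the nearest even value so they go up about as often as down. The following is only a minimal sketch under the same assumptions as the module above (30-bit Q2.28 input, 16-bit Q2.14 output, saturation on overflow); the module name round_half_even and its internal signal names are hypothetical and chosen for illustration.

```systemverilog
// Hypothetical module: round-half-to-even from Q2.28 to Q2.14 with saturation.
module round_half_even (
    input  logic signed [29:0] full_res_in,  // Q2.28
    output logic signed [15:0] rounded_out   // Q2.14
);
    logic signed [16:0] kept;      // upper 16 bits plus one sign-extension guard bit
    logic               is_tie;    // dropped bits are exactly half an output LSB
    logic               round_up;  // 1 when the kept value must be incremented
    logic signed [16:0] rounded;

    always_comb begin
        // Keep bits [29:14] and replicate the sign bit into the guard position
        kept   = {full_res_in[29], full_res_in[29:14]};

        // Exactly half an LSB: first dropped bit set, all lower bits clear
        is_tie = (full_res_in[13:0] == 14'h2000);

        // Round up when the dropped part is more than half an LSB,
        // or on an exact tie when the kept value is odd (bit 14 set)
        round_up = full_res_in[13] && (!is_tie || full_res_in[14]);

        // Zero-extend the increment and mark it signed so the sum stays signed
        rounded = kept + $signed({1'b0, round_up});

        // Only positive overflow is possible because the increment is 0 or 1
        if (rounded > 17'sh07FFF) rounded_out = 16'sh7FFF;
        else                      rounded_out = rounded[15:0];
    end
endmodule
```

With this sketch an input of 30'h00002000 (exactly half a Q2.14 LSB) rounds down to 0, while 30'h00006000 (1.5 LSBs) rounds up to 2, so ties alternate direction instead of always going up. The cost compared with the simpler module is the 14-bit comparison used to detect the tie.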

Upvotes: 1
