Brahadeesh

Reputation: 2255

verilog IEEE 754 single precision to integer conversion

I am trying to convert an IEEE 754 single-precision binary format number to an integer. I am using the following loop:

for(i=22;i>=0;i=i-1)
begin
   a1=in1[i]*(2**(i-23));
end
a1=a1+1;
a1=a1*(2**(in1[30:23]-8'b01111111));
a1=((-1)**(in1[32]))*a1;

I need to do this 7 more times in my program. Is there a library function that takes a 32-bit input and gives an integer output? If so, how do I include that function in my program? Thank you.

update: will the snippet above work correctly?

Upvotes: 2

Views: 10361

Answers (2)

user597225

For behavioral code, use either $rtoi() or $realtobits():

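real        in1;
integer     a1;
reg  [63:0] b1;  // must be a reg, not a wire, to be assigned procedurally
initial begin
   a1 = $rtoi(in1);       // truncates the fractional part
   b1 = $realtobits(in1); // 64-bit double-precision bit pattern; probably not what you want
end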

You can use $bitstoreal() if you need to cast a bit vector to a real type.

EDIT: So if I follow your comments correctly, you're building a model of a floating-point ALU that works on 32-bit data values. In this case you could use real data types, since Verilog handles floating point natively (as double-precision real values). Of course, you won't be able to detect certain situations.

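task realAdd(input [31:0] in1, input [31:0] in2, output [31:0] out);
   // A task has no return type, and declarations must precede begin/end.
   real rIn1, rIn2, rOut;
begin
   // NOTE: $bitstoreal()/$realtobits() operate on 64-bit double-precision
   // patterns, so in1, in2 and out need the conversions described below.
   rIn1 = $bitstoreal(in1);
   rIn2 = $bitstoreal(in2);
   rOut = rIn1 + rIn2;

   out = $realtobits(rOut);
end
endtask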

These functions all use double precision so you'll need to do some trivial bit extensions to handle single precision inputs, and some non-trivial bounds checking/truncation on the output. You can avoid this by using SystemVerilog, which has the $bitstoshortreal()/$shortrealtobits() functions that work on single precision values.
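A minimal sketch of the "trivial bit extension" on the input side, in Verilog-2001. The helper name single_to_double is hypothetical, and as a simplifying assumption subnormal inputs are flushed to signed zero rather than normalized:

```verilog
module single_to_double_demo;

   // Hypothetical helper: re-pack an IEEE 754 single-precision pattern
   // as a double-precision pattern.
   // Assumption: subnormal inputs are flushed to signed zero.
   function [63:0] single_to_double(input [31:0] s);
      reg        sign;
      reg [7:0]  exp8;
      reg [22:0] frac;
      reg [10:0] exp11;
      begin
         sign = s[31];
         exp8 = s[30:23];
         frac = s[22:0];
         if (exp8 == 8'd0) begin
            // Zero or subnormal: emit signed zero.
            single_to_double = {sign, 63'd0};
         end else begin
            if (exp8 == 8'hFF)
               exp11 = 11'h7FF;                   // infinity or NaN
            else
               exp11 = {3'b000, exp8} + 11'd896;  // re-bias: 1023 - 127 = 896
            // The 23 fraction bits become the top of the 52-bit fraction.
            single_to_double = {sign, exp11, frac, 29'd0};
         end
      end
   endfunction

   real    r;
   integer a1;

   initial begin
      r  = $bitstoreal(single_to_double(32'h40490FDB)); // single nearest to pi
      a1 = $rtoi(r);                                     // truncates toward zero
      $display("r = %f, a1 = %0d", r, a1);
   end
endmodule
```

With the pi pattern above, a1 should come out as 3. Because the conversion is a function, it can be called at each of the places the question needs it.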

If you want hardware for this, Computer Organization & Design has a description of a multi-cycle implementation. As Andy posted, there may be better resources out there for your case. These are not simple to design.

Upvotes: 2

Andy

Reputation: 4866

One way to do it would be to:

  1. re-pack the operands in 64-bit double precision format
  2. convert the values to reals with $bitstoreal
  3. use native Verilog floating point math to operate on the reals
  4. convert the reals back to bits with $realtobits
  5. convert the 64-bit double back to single-precision format. The value needs to be clipped if it is outside the representable range (there are both "too far from zero" and "too close to zero" cases that need to be clamped).
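
Step 5 is the fiddly one. A rough sketch of one way to do the clamping, in Verilog-2001, follows. The helper name double_to_single is hypothetical, and the policy is an assumption rather than the only reasonable choice: results too far from zero become infinity, results too close to zero become signed zero (no subnormals are produced), and the fraction is truncated instead of rounded.

```verilog
module double_to_single_demo;

   // Hypothetical helper: narrow an IEEE 754 double pattern to a single.
   // Assumptions: overflow -> infinity, underflow -> signed zero,
   // fraction truncated (round toward zero), NaN payload not preserved.
   function [31:0] double_to_single(input [63:0] d);
      reg        sign;
      reg [10:0] exp11;
      reg [51:0] frac;
      reg [10:0] exp_rebiased;
      begin
         sign  = d[63];
         exp11 = d[62:52];
         frac  = d[51:0];
         if (exp11 == 11'h7FF) begin
            if (frac != 52'd0)
               double_to_single = {sign, 8'hFF, 23'h400000}; // NaN
            else
               double_to_single = {sign, 8'hFF, 23'd0};      // infinity
         end else if (exp11 > 11'd1150) begin
            // Too far from zero: unbiased exponent above +127.
            double_to_single = {sign, 8'hFF, 23'd0};
         end else if (exp11 < 11'd897) begin
            // Too close to zero: unbiased exponent below -126.
            double_to_single = {sign, 31'd0};
         end else begin
            exp_rebiased     = exp11 - 11'd896;              // 1023 - 127 = 896
            double_to_single = {sign, exp_rebiased[7:0], frac[51:29]};
         end
      end
   endfunction

   initial begin
      $display("%h", double_to_single($realtobits(1.5)));      // expect 3fc00000
      $display("%h", double_to_single($realtobits(1.0e300)));  // expect 7f800000
      $display("%h", double_to_single($realtobits(1.0e-300))); // expect 00000000
   end
endmodule
```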

If you want to convert the floats to fixed point using the strategy in the question, keep in mind that Verilog does not have native fixed point support. You cannot meaningfully evaluate 2^(negative exponent) as an integer; you'll just get zero. Either restructure the algorithm so the exponents are always positive, or use right-shifts (>>) rather than multiplication by a negative power of two.
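
As an illustration of the shift-based approach, here is a minimal Verilog-2001 sketch that truncates a single-precision pattern toward zero. The function name float_to_int is hypothetical. Two assumptions are baked in: magnitudes of 2**31 or more (including infinity and NaN) saturate to 32'h7FFFFFFF before the sign is applied, and anything below 1.0 in magnitude returns 0.

```verilog
module float_to_int_demo;

   // Hypothetical helper: IEEE 754 single -> signed 32-bit integer,
   // truncating toward zero. value = mant * 2**(exp8 - 150), where
   // 150 = bias (127) + fraction width (23).
   function integer float_to_int(input [31:0] f);
      reg [7:0]  exp8;
      reg [31:0] mant;  // 24-bit significand with the implicit leading 1
      reg [31:0] mag;
      begin
         exp8 = f[30:23];
         mant = {8'd0, 1'b1, f[22:0]};
         if (exp8 < 8'd127)
            mag = 32'd0;                     // |value| < 1 (zero, subnormals too)
         else if (exp8 > 8'd157)
            mag = 32'h7FFFFFFF;              // |value| >= 2**31, inf or NaN: saturate
         else if (exp8 >= 8'd150)
            mag = mant << (exp8 - 8'd150);   // non-negative power of two: shift left
         else
            mag = mant >> (8'd150 - exp8);   // negative power of two: shift right
         float_to_int = f[31] ? -mag : mag;  // apply the sign bit (bit 31)
      end
   endfunction

   initial begin
      $display("%0d", float_to_int(32'h40490FDB)); // pi       -> 3
      $display("%0d", float_to_int(32'hC2F70000)); // -123.5   -> -123
      $display("%0d", float_to_int(32'h4B000001)); // 8388609  -> 8388609
   end
endmodule
```

Since every step is a shift or a compare, nothing here depends on evaluating a negative power of two, and the function can be reused wherever the conversion is needed.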

Upvotes: 2
