Query on VHDL generics in packages

Question

I have written some simple VHDL code to add two matrices of 32-bit floating-point numbers. The matrix dimensions are defined in a package. Currently, I hard-code the matrix dimensions in the VHDL code and use the corresponding type from the package. However, I would like to use a generic in the design to handle matrices of different dimensions. For this I would have to somehow select the right type defined in the package. How do I go about doing this? My current VHDL code is below.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
use work.mat_pak.all;

entity newproj is
    Port ( clk : in  STD_LOGIC;
           clr : in  STD_LOGIC;
           start : in  STD_LOGIC;
           A_in : in  t2;
           B_in : in  t2;
           AplusB : out  t2;
           parallel_add_done : out  STD_LOGIC);
end newproj;

architecture Behavioral of newproj is
COMPONENT add
  PORT (
    a : IN STD_LOGIC_VECTOR(31 DOWNTO 0);
    b : IN STD_LOGIC_VECTOR(31 DOWNTO 0);
    clk : IN STD_LOGIC;
    sclr : IN STD_LOGIC;
    ce : IN STD_LOGIC;
    result : OUT STD_LOGIC_VECTOR(31 DOWNTO 0);
    rdy: OUT STD_LOGIC
  );
END COMPONENT;


signal temp_out: t2 := (others=>(others=>(others=>'0')));
signal add_over: t2bit:=(others=>(others=>'0'));
signal check_all_done,init_val: std_logic:='0';
begin
    init_val <= '1';

    g0: for k in 0 to 1 generate
        g1: for m in 0 to 1 generate
            add_instx: add port map(A_in(k)(m), B_in(k)(m), clk, clr, start, temp_out(k)(m), add_over(k)(m));
        end generate;
    end generate;

    g2: for k in 0 to 1 generate
        g3: for m in 0 to 1 generate
            check_all_done <= add_over(k)(m) and init_val;
        end generate;
    end generate;

    p1_add: process(check_all_done, temp_out)
    begin
        AplusB <= temp_out;
        parallel_add_done <= check_all_done;
    end process;

end Behavioral;

My package is below.

library IEEE;
use IEEE.STD_LOGIC_1164.all;
use IEEE.NUMERIC_STD.ALL;

package mat_pak is

    subtype small_int is integer range 0 to 2;

    type t22 is array (0 to 1) of std_logic_vector(31 downto 0);
    type t2 is array (0 to 1) of t22; -- 2*2 matrix

    type t22bit is array (0 to 1) of std_logic;
    type t2bit is array (0 to 1) of t22bit; -- 2*2 matrix bit

    type t33 is array (0 to 2) of std_logic_vector(31 downto 0);
    type t3 is array (0 to 2) of t33; -- 3*3 matrix

end mat_pak;

Any suggestions would be welcome. Thank you.

user1155120 · Accepted Answer

There are some logical issues with your design.

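First, there is a practical limit to the number of ports a sub-hierarchy of a design can tolerate. With 2x2 matrices of 32-bit elements, each matrix port is 128 bits wide, so A_in, B_in and AplusB together already take 384 'bits' of matrix inputs and outputs, and that number grows with the square of the matrix dimension. Do you really believe this number should be configurable?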

At some point it will only fit in the very large FPGA devices, and shortly thereafter not fit there either.

Suppose add takes a variable number of clocks, and parallel_add_done signals when an aplusb datum is available, made up of the matrix elements contributed by all of the instantiated add components. In that case the individual rdy signals have to be ANDed together. If the adds all take the same amount of time, you could take rdy from any one of them (if your silicon were not that deterministic it would not be usable; there are registers in add).

The nested generate statements all assign the result of the AND between add_over(k,m) and init_val (which is a synthesis constant of '1') to the same signal, check_all_done. The intended effect is a wired AND of the add_over(k,m) bits, which doesn't work in VHDL and is likely not achievable in synthesis, either.
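To illustrate why the repeated assignment does not behave as an AND, here is a minimal simulation sketch. The entity name multi_driver_demo and the signals rdy and wired are made up for this illustration and are not part of the original design. Each generate iteration creates its own driver on the same std_logic signal, and the std_logic resolution function combines a driven '1' and a driven '0' into 'X' rather than '0'.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical demo entity, not part of the original design.
entity multi_driver_demo is
end entity multi_driver_demo;

architecture sim of multi_driver_demo is
    -- Stand-in for two rdy flags: rdy(0) = '1', rdy(1) = '0'.
    signal rdy:   std_logic_vector(0 to 1) := "10";
    signal wired: std_logic;
begin
    -- Same pattern as the nested generate in the question:
    -- every iteration adds another driver to the one signal.
    g: for i in rdy'range generate
        wired <= rdy(i);
    end generate;

    check: process
    begin
        wait for 1 ns;
        -- A wired AND would give '0'; std_logic resolution of
        -- a '1' driver against a '0' driver gives 'X' instead.
        assert wired = 'X'
            report "unexpected value: " & std_logic'image(wired)
            severity error;
        wait;
    end process;
end architecture sim;
```

When all drivers happen to agree the resolved value looks correct, which is why this kind of bug can go unnoticed until one rdy lags behind the others.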

Note I also showed the proper indexing method for the two dimensional arrays.

Using Jonathan's method of sizing matrices:

library ieee;
use ieee.std_logic_1164.all;

package mat_pak is

    type matrix  is array (natural range <>, natural range <>)
               of std_logic_vector(31 downto 0);
    type bmatrix is array (natural range <>, natural range <>) 
               of std_logic;                      
end package mat_pak;

library ieee;
use ieee.std_logic_1164.all;
use work.mat_pak.all;

entity newproj is
    generic ( size:  natural := 2 );
    port ( 
        clk:                in  std_logic;
        clr:                in  std_logic;
        start:              in  std_logic;
        a_in:               in  matrix (0 to size - 1, 0 to size - 1);
        b_in:               in  matrix (0 to size - 1, 0 to size - 1);
        aplusb:             out matrix (0 to size - 1, 0 to size - 1);
        parallel_add_done:  out std_logic
    );
end entity newproj;

architecture behavioral of newproj is
    component add
        port (
            a:      in  std_logic_vector(31 downto 0);
            b:      in  std_logic_vector(31 downto 0);
            clk:    in  std_logic;
            sclr:   in  std_logic;
            ce:     in  std_logic;
            result: out std_logic_vector(31 downto 0);
            rdy:    out std_logic
        );
    end component;

    signal temp_out: matrix (0 to size - 1, 0 to size - 1) 
                :=  (others => (others => (others => '0')));
    signal add_over: bmatrix (0 to size - 1, 0 to size - 1)
                := (others => (others => '0'));
begin
g0: 
    for k in  0 to size - 1 generate 
g0x: 
        for m in 0 to size - 1 generate
            add_instx: add 
                port map (
                    a => a_in(k,m),
                    b => b_in(k,m),
                    clk => clk,
                    sclr => clr,
                    ce => start,
                    result => temp_out(k,m),
                    rdy => add_over(k,m)
                );
        end generate;   
    end generate;

    aplusb <= temp_out;

p1_add:
    process (add_over)
        variable check_all_done: std_logic;
    begin
        check_all_done := '1';
        for k in 0 to size - 1 loop
            for m in 0 to size -1 loop
                check_all_done := check_all_done and add_over(k,m);
            end loop;
        end loop;
        parallel_add_done <= check_all_done;
    end process;

end architecture behavioral;

We find that we really want to AND the various rdy outputs (the add_over array) together. In VHDL-2008 this can be done with the unary AND; otherwise you're counting on the synthesis tool to flatten the chain of ANDs built by the loop (and they generally do).
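As a rough sketch of the VHDL-2008 variant: as far as I know, the unary (reduction) AND is available for one-dimensional vectors such as std_logic_vector, not for a user-defined two-dimensional type like bmatrix, so this example first copies the flags into a flat vector. The entity name all_ready and the signal flat are assumptions made for the illustration; bmatrix comes from the mat_pak package above, and the code needs to be compiled in VHDL-2008 mode.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use work.mat_pak.all;

-- Hypothetical helper entity, not part of the original answer.
entity all_ready is
    generic ( size: natural := 2 );
    port (
        add_over: in  bmatrix (0 to size - 1, 0 to size - 1);
        all_done: out std_logic
    );
end entity all_ready;

architecture rtl of all_ready is
    -- One bit per matrix element, in row-major order.
    signal flat: std_logic_vector(0 to size * size - 1);
begin
    -- Each element of flat gets exactly one driver.
    gk: for k in 0 to size - 1 generate
        gm: for m in 0 to size - 1 generate
            flat(k * size + m) <= add_over(k, m);
        end generate;
    end generate;

    -- VHDL-2008 unary AND: '1' only when every bit of flat is '1'.
    all_done <= and flat;
end architecture rtl;
```

The process with the nested loops has the same effect and also works in earlier revisions of the language, so which one to use is mostly a matter of tool support and taste.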

I made the assignment to aplusb a concurrent assignment.

So I dummied up an add entity with an empty architecture. The above then analyzes, elaborates and simulates, which shows that there are no length mismatches anywhere in the connectivity.
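The dummy add and the test harness were not shown; something along the following lines would do, under the assumption that everything is compiled into library work. The architecture name stub, the testbench newproj_tb and its signal names are invented for this sketch. It also shows how the size generic is set when instantiating newproj, here for 3x3 matrices.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical stand-in for the real floating-point adder core.
entity add is
    port (
        a:      in  std_logic_vector(31 downto 0);
        b:      in  std_logic_vector(31 downto 0);
        clk:    in  std_logic;
        sclr:   in  std_logic;
        ce:     in  std_logic;
        result: out std_logic_vector(31 downto 0);
        rdy:    out std_logic
    );
end entity add;

architecture stub of add is
begin
    -- Intentionally empty: the outputs are never driven. This only
    -- checks that the port widths and connections line up.
end architecture stub;

library ieee;
use ieee.std_logic_1164.all;
use work.mat_pak.all;

entity newproj_tb is
end entity newproj_tb;

architecture test of newproj_tb is
    constant n: natural := 3;  -- matrix dimension under test
    signal clk, clr, start: std_logic := '0';
    signal a, b, sum: matrix (0 to n - 1, 0 to n - 1);
    signal done: std_logic;
begin
    -- The generic sizes all three matrix ports at once.
    dut: entity work.newproj
        generic map ( size => n )
        port map (
            clk               => clk,
            clr               => clr,
            start             => start,
            a_in              => a,
            b_in              => b,
            aplusb            => sum,
            parallel_add_done => done
        );
end architecture test;
```

Changing the constant n is enough to elaborate the design for another dimension; no new types have to be added to the package.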
