Viking

Reputation: 348

How to reuse BRAM once it's no longer needed by a module?

I'm working on a (seemingly) simple project as a learning exercise: connecting an SSD1331-based 96x64 PMOD display to a PC via an iCEstick (Lattice iCE40HX-1k FPGA), so that I can send an RGB565-encoded image over USB to be shown on the display.

The thing is, the SSD1331 display requires an initialization procedure just to get to the "clear black screen" state. There are about 20 commands to be shifted into the display controller; their length varies between 1 and 5 bytes, for a total of 44 bytes.

So far I have written a Verilog pwr_on module with an FSM that shifts the commands into the PMOD in the right sequence; the command values are defined as localparams. Everything works fine, but there's always a but. I figure all those command constants get stored in LUTs (I'm not inferring any RAM blocks, so where else would they go, right?). With only 1,280 LUTs available in the iCE40HX1k, spending a hundred or so of them on an init procedure that takes about 150 ms and is never needed again until the next reset seems like a waste.

Now, I can see the following ways to deal with this problem:

  1. Don't implement init sequence in FPGA at all; instead, send those commands via USB.
    Simple but not that interesting; after all, I'm trying to learn FPGA programming, not Linux drivers.
  2. Take advantage of SB_WARMBOOT and multi-configuration.
    iCE40HX can have up to 4 configurations stored in EEPROM; the SB_WARMBOOT primitive basically lets you jump between them at will. I could put the init procedure in configuration 0 and, once it's done, jump to configuration 1 with USB support, thus starting from a clean slate. However, I need to hold at least 3 display PMOD pins (pmod_enable, vcc_enable and pmod_rstn) high during the transition between configurations. I cannot find any means to do that; if anybody knows, please point me in the right direction.
  3. Store commands data in BRAM.
    HX1K has 16 RAM4K blocks (each storing 4096 bits) so even one of them should provide plenty of room for 44 bytes of command data without spending valuable LUTs.
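
A minimal sketch of what option 3 could look like, assuming the usual inference pattern (a reg array with initial contents and a registered read), which yosys can typically map to a block RAM. The module name init_rom and its ports are made up for illustration, and the two bytes shown are only placeholders for the real 44-byte sequence:

```verilog
// Hypothetical init ROM; module and signal names are illustrative only.
module init_rom (
    input  wire       clk,
    input  wire [8:0] addr,   // 512 x 8 organisation
    output reg  [7:0] data
);
    reg [7:0] mem [0:511];
    integer i;

    initial begin
        // Give every location a defined value first.
        for (i = 0; i < 512; i = i + 1)
            mem[i] = 8'h00;
        // Placeholder bytes; the real 44-byte init sequence would go here.
        mem[0] = 8'hAE;   // SSD1331 "display off"
        mem[1] = 8'hAF;   // SSD1331 "display on"
    end

    // Registered (synchronous) read: block RAM cannot be read combinationally.
    always @(posedge clk)
        data <= mem[addr];
endmodule
```

Note that data shows up one clock cycle after addr is presented, so the FSM has to account for that latency.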

Option 3 looks simple enough. However, being a scrooge about my resources, I'd love to have that RAM4K block available for other tasks once init is done. Now, it seems to me that the Verilog synthesizer (I'm using yosys) is completely oblivious to the fact that once the pwr_on module pulls its done wire high, the BRAM cell it was attached to could be reused when inferring other logic.

One solution that comes to mind is to allocate that BRAM block in a separate module, fill it with the data I need for init, wire it to the pwr_on module, and then rewire it to other modules as needed. Yet this approach looks ugly for a few reasons, hence the question: is there a trick I'm missing? How could I use one BRAM block in, say, an SB_RAM512x8 configuration for one module and then reuse it as an SB_RAM256x16 for another?

Upvotes: 1

Views: 623

Answers (3)

BrunoLevy

Reputation: 2633

For a similar project (driving a SSD1351 OLED display with an ICEStick), I wrote the initialization sequence as a "wired ROM", with a big case() statement, as follows:

module SSD1351InitROM(
    input  wire [5:0] address,
    output reg  [8:0] data
);
    always @(*) begin
        case(address)
            0:  data=9'h0_02; // Reset low during 0.5s
            1:  data=9'h0_01; // Reset high during 0.5s
            2:  data=9'h0_fd; 3: data=9'h1_12; // Unlock driver
            4:  data=9'h0_fd; 5: data=9'h1_b1; // unlock commands
            6:  data=9'h0_ae; // display off
            7:  data=9'h0_a4; // display mode off
            8:  data=9'h0_15;  9: data=9'h1_00; 10: data=9'h1_7f; // column address
            11: data=9'h0_75; 12: data=9'h1_00; 13: data=9'h1_7f; // row address
            14: data=9'h0_b3; 15: data=9'h1_f1; // front clock divider (see section 8.5 of manual)
            16: data=9'h0_ca; 17: data=9'h1_7f; // multiplex
            18: data=9'h0_a0; 19: data=9'h1_74; // remap,data format,increment
            20: data=9'h0_a1; 21: data=9'h1_00; // display start line
            22: data=9'h0_a2; 23: data=9'h1_00; // display offset
            24: data=9'h0_ab; 25: data=9'h1_01; // VDD regulator ON
            26: data=9'h0_b4; 27: data=9'h1_a0; 28: data=9'h1_b5; 29: data=9'h1_55; // segment voltage ref pins
            30: data=9'h0_c1; 31: data=9'h1_c8; 32: data=9'h1_80; 33: data=9'h1_c0; // contrast current for colors A,B,C
            34: data=9'h0_c7; 35: data=9'h1_0f; // master contrast current
            36: data=9'h0_b1; 37: data=9'h1_32; // length of segments 1 and 2 waveforms
            38: data=9'h0_b2; 39: data=9'h1_a4; 40: data=9'h1_00; 41: data=9'h1_00; // display enhancement
            42: data=9'h0_bb; 43: data=9'h1_17; // first pre-charge voltage phase 2
            44: data=9'h0_b6; 45: data=9'h1_01; // second pre-charge period (see table 9-1 of manual)
            46: data=9'h0_be; 47: data=9'h1_05; // Vcomh voltage
            48: data=9'h0_a6; // display on // a6 = normal, a7 = inverse, a5 = all on
            49: data=9'h0_af; // display mode on
            default: data=9'h0_00; // end of program
        endcase
    end

endmodule
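
For illustration, here is one way such a ROM could be stepped through. This is only a sketch and not necessarily how the project does it: the InitSequencer module, its ready/valid handshake and its port names are assumptions. The only thing taken from the ROM above is that 9'h0_00 marks the end of the program.

```verilog
// Hypothetical sequencer that walks a combinational init ROM.
// rom_addr / rom_data are meant to be wired to SSD1351InitROM's
// address / data ports; the ready/valid handshake is an assumption.
module InitSequencer(
    input  wire       clk,
    input  wire       reset,
    input  wire       ready,     // consumer can accept the current word
    input  wire [8:0] rom_data,  // word at rom_addr (combinational ROM)
    output reg  [5:0] rom_addr,
    output wire       valid,     // rom_data holds a word still to be sent
    output wire       done       // end-of-program marker reached
);
    // The ROM returns 9'h0_00 for every address past the last entry.
    assign done  = (rom_data == 9'h0_00);
    assign valid = !done;

    always @(posedge clk) begin
        if (reset)
            rom_addr <= 6'd0;
        else if (valid && ready)
            rom_addr <= rom_addr + 6'd1;  // advance once the word is taken
    end
endmodule
```

Because the address stops advancing as soon as the end marker is read, done stays high until the next reset.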

I was also concerned about eating too many LUTs for just the init sequence, but it stays reasonable. Here is the device usage for the whole project, which displays a little animation on the SSD1351:

Info: Device utilisation:
Info:            ICESTORM_LC:   246/ 1280    19%
Info:           ICESTORM_RAM:     0/   16     0%
Info:                  SB_IO:    11/  112     9%
Info:                  SB_GB:     6/    8    75%
Info:           ICESTORM_PLL:     1/    1   100%
Info:            SB_WARMBOOT:     0/    1     0%

I guess this leaves enough resources for the UART that you'll need to decode image data from USB (typically around 100 LUTs, I'd say). I'm using the one from swapforth/J1: https://github.com/jamesbowman/swapforth/blob/master/j1a/icestorm/uart.v (easy to understand, and not LUT-hungry).

The complete sources of my project (and others) are available on my GitHub page: https://github.com/BrunoLevy/learn-fpga/

Disclaimer: I'm a beginner in Verilog, so my style is probably far from perfect...

Upvotes: 0

Baard

Reputation: 829

Multiplex the read address to the EBR used for PMOD configuration data

As far as I know, the EBRs of the iCE40 cannot change WRITE_MODE and READ_MODE at run time (please correct me if I'm wrong). Hence, I would suggest instantiating your EBR in the configuration you want to use after the PMOD has been initialized. The contents of the EBR must include the configuration data for the PMOD, specified the usual way via INIT_0 through INIT_F.

The read address of the EBR needs to be a mux between the address from the FSM controlling PMOD initialization and the address to use after initialization; this will only cost around 8 LUTs.
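
A rough sketch of the idea, written as an inferred 256x16 memory rather than an explicit EBR instance. The module name shared_ebr, all port names and the placeholder contents are assumptions made for illustration:

```verilog
// Hypothetical wrapper: one 256x16 memory shared between the init FSM
// and the logic that takes over once initialization is done.
module shared_ebr (
    input  wire        clk,
    input  wire        init_done,   // high once the PMOD init has finished
    input  wire [7:0]  init_addr,   // read address from the init FSM
    input  wire [7:0]  user_raddr,  // read address used after init
    input  wire        user_we,
    input  wire [7:0]  user_waddr,
    input  wire [15:0] user_wdata,
    output reg  [15:0] rdata
);
    reg [15:0] mem [0:255];
    integer i;

    initial begin
        for (i = 0; i < 256; i = i + 1)
            mem[i] = 16'h0000;
        // Placeholder contents; the real init data would go here.
        mem[0] = 16'h00AE;
        mem[1] = 16'h00AF;
    end

    // The mux on the read address: one 2:1 mux per address bit.
    wire [7:0] raddr = init_done ? user_raddr : init_addr;

    always @(posedge clk) begin
        rdata <= mem[raddr];
        // Writes are blocked until init is over so the init data survives.
        if (user_we && init_done)
            mem[user_waddr] <= user_wdata;
    end
endmodule
```

With an 8-bit address the mux is eight 2:1 multiplexers, which is where the estimate of around 8 LUTs comes from.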

Upvotes: 1

Oldfart

Reputation: 6259

I use Xilinx, but the differences between the basic building blocks in FPGAs are small.

I did a quick search for "Lattice BRAM" and found that Lattice memories are, just as in Xilinx, dual-ported. This means you can access the memory from two locations. You should check if your device has that option.

If so, the solution is to instantiate a dual-ported memory and use it first as a ROM to initialize the display. Then use the other port to access the BRAM as normal memory. The great advantage is that all the logic for two-port access is already on the silicon, so you don't have to use any of the programmable logic for that.

Beware that only a device re-configuration will restore the contents. A normal reset will not.

That leaves the problem of initializing the RAM contents at start-up. I know it can be done in Xilinx, so you will have to look for the equivalent Lattice application note.
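
For the iCE40 specifically, a sketch of how the initial contents might be given when instantiating the block RAM primitive directly. It assumes the SB_RAM40_4K primitive in its 256x16 mode, which has one read port and one write port (so reads from the init FSM and from later logic would still have to share the read port). The wrapper name, its ports and the placeholder data are made up for illustration:

```verilog
// Hypothetical wrapper around the iCE40 SB_RAM40_4K primitive.
module init_ram_prim (
    input  wire        clk,
    input  wire [7:0]  raddr,
    output wire [15:0] rdata,
    input  wire        we,
    input  wire [7:0]  waddr,
    input  wire [15:0] wdata
);
    SB_RAM40_4K #(
        .READ_MODE (0),              // 0 selects the 256x16 organisation
        .WRITE_MODE(0),
        // INIT_0 holds words 0..15, word 0 in the least significant 16 bits.
        // Placeholder data: word 0 = 16'h00AE, word 1 = 16'h00AF.
        .INIT_0(256'h00AF_00AE)
    ) ram (
        .RDATA(rdata),
        .RADDR({3'b000, raddr}),     // address ports are 11 bits wide
        .RCLK (clk),
        .RCLKE(1'b1),
        .RE   (1'b1),
        .WDATA(wdata),
        .WADDR({3'b000, waddr}),
        .WCLK (clk),
        .WCLKE(1'b1),
        .WE   (we),
        .MASK (16'h0000)             // a 0 bit lets that data bit be written
    );
endmodule
```

As noted above, these contents come from the bitstream, so anything overwritten through the write port is only restored by reconfiguring the device.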

Upvotes: 0
