Computer architecture, 4-way cache hit/replacement confusion

Question

I am trying to implement a 4-way cache in Verilog, but I am confused about one cache lookup scenario. Say I have the following specs:

     C = ABS
        C = 1KB
        A = 4
        B = 128 bits (4 DWORD)
        S = C/AB = (8192)/(4*128) = 16

        offset = lg(B) = 7 bits
        index  = lg(S) = 4 bits
        tag    = 32 - offset - index = 21 bits

C: Capacity in data arrays
A: Way-Associativity
B: Block Size (Cacheline)
   How many bytes in a block
S: Number of Sets:
   A set contains blocks sharing the same index

And in my memory I have:

H  0000_0000   0000 0000 0000 0000 0000 0000 0|000 0|000 0000    
E  0000_0004
L  0000_0008
L  0000_000C
O  0000_0010   0000 0000 0000 0000 0000 0000 0|000 0|001 0000
   0000_0014
W  0000_0018
O  0000_001C
R  0000_0020   0000 0000 0000 0000 0000 0000 0|000 0|010 0000    
L  0000_0024
D  0000_0028
!  0000_002C

Say I start with an empty cache and request a load word at address 0x0000_0000. Since the cache is empty, this is a miss, and I write HELL into the cache line at index 0 in one of the 4 arrays.

Then I request another load word at address 0x0000_0010, and I am confused at this point.

My question: since the tag and index are the same, it is a hit, but my cache line does not have the word O in it. What should my cache do in this situation? Do I kick out HELL and write O WO into the same array? If so, how should I differentiate those two addresses, since we are only looking at the tag and index bits?

The other option I considered: since it is a hit, nothing should be evicted, because we found a match. But that match does not contain the word I am requesting, so skipping the replacement must be wrong even though the lookup is a hit. I am stuck going in circles between evicting and treating it as a hit.

Peter Cordes · Accepted Answer

After your update, the problem becomes clear: B should be the size in bytes, but you've used the size in bits. The correct number of offset bits in addresses is log2(16) = 4. If you move the | delimiters in your diagram to the correct position, everything will work out fine: you'll see that each 16B block has an index that's 1 higher than the previous.

H  0000_0000   0000 0000 0000 0000 0000 0000|0000|0000    
E  0000_0004
L  0000_0008
L  0000_000C
O  0000_0010   0000 0000 0000 0000 0000 0000|0001|0000
   0000_0014
...

I also noticed that your addresses were previously 36 bits, with an extra block of 0000. I double-checked the number of index bits, too: 4 index bits means 16 sets of 4 ways each. 16 * 4 * 16B = 1024B, so that's right for your 1 KiB cache.
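
To make that arithmetic easy to re-check, here is a minimal Python sketch (a software model, not Verilog) that derives the field widths from sizes in bytes and splits a 32-bit address. The constant names and the `split_address` helper are made up for illustration; only the numbers come from the question.

```python
# Cache geometry from the question, with every size expressed in bytes.
CAPACITY_BYTES = 1024      # C = 1 KiB
WAYS = 4                   # A
BLOCK_BYTES = 16           # B = 128 bits = 16 bytes
ADDRESS_BITS = 32

NUM_SETS = CAPACITY_BYTES // (WAYS * BLOCK_BYTES)   # S = C / (A * B)
OFFSET_BITS = BLOCK_BYTES.bit_length() - 1          # log2(B); B is a power of two
INDEX_BITS = NUM_SETS.bit_length() - 1              # log2(S); S is a power of two
TAG_BITS = ADDRESS_BITS - INDEX_BITS - OFFSET_BITS


def split_address(addr):
    """Return (tag, index, offset) for a byte address."""
    offset = addr & (BLOCK_BYTES - 1)
    index = (addr >> OFFSET_BITS) & (NUM_SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset


if __name__ == "__main__":
    print(NUM_SETS, OFFSET_BITS, INDEX_BITS, TAG_BITS)
    for addr in (0x0000_0000, 0x0000_000C, 0x0000_0010, 0x0000_0020):
        print(hex(addr), split_address(addr))
```

With these values, 0x0000_0000 and 0x0000_000C share index 0, while 0x0000_0010 lands in index 1 and 0x0000_0020 in index 2, matching the corrected diagram.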

You're having the same problem as in your previous question, with multiple chunks of memory going to the same cache line, but for a different reason. This is a sign that you got something seriously wrong, because one cache line must always be large enough to hold all the data in the addresses with the same tag+index bits.
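
As a rough illustration of that rule, the sketch below models the lookup in Python (again a software model, not Verilog). It assumes one character per 32-bit word as in the question's memory diagram, and it uses FIFO replacement purely as a placeholder, since the question does not name a policy. The `Cache` class and `read_block` helper are hypothetical names.

```python
BLOCK_BYTES = 16   # 128-bit line
NUM_SETS = 16
WAYS = 4
WORD_BYTES = 4

# One character per 32-bit word, as in the question's memory diagram.
# Only addresses 0x00..0x2F are backed by data in this toy model.
WORDS = "HELLO WORLD!"


def read_block(block_addr):
    """Fetch the whole aligned block (4 words) that starts at block_addr."""
    first = block_addr // WORD_BYTES
    return WORDS[first:first + BLOCK_BYTES // WORD_BYTES]


class Cache:
    def __init__(self):
        # Each set holds up to WAYS entries of (tag, block_data).
        self.sets = [[] for _ in range(NUM_SETS)]

    def load_word(self, addr):
        offset = addr % BLOCK_BYTES
        index = (addr // BLOCK_BYTES) % NUM_SETS
        tag = addr // (BLOCK_BYTES * NUM_SETS)
        ways = self.sets[index]
        for way_tag, data in ways:
            if way_tag == tag:
                # Same tag + index means the whole block is already here;
                # the offset only selects a word inside it.
                return "hit", data[offset // WORD_BYTES]
        # Miss: bring in the entire block, evicting only if the set is full.
        data = read_block(addr - offset)
        if len(ways) == WAYS:
            ways.pop(0)  # FIFO replacement, an assumption for this sketch
        ways.append((tag, data))
        return "miss", data[offset // WORD_BYTES]


if __name__ == "__main__":
    cache = Cache()
    for addr in (0x00, 0x0C, 0x10, 0x14):
        print(hex(addr), cache.load_word(addr))
```

In this model a hit can never be missing the requested word: the load at 0x0000_0010 maps to index 1, misses, and fills a different line without touching HELL in index 0.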
