How do I deal with numbers less than 32 bit in a 32 bit system?

Question

I'm attempting to simulate a 32 bit computer under a very scuffed architecture I have come up with on my own. I am probably doing everything wrong but it's just a fun thing I'm doing to teach myself C. I am encountering a slight issue where I have no idea how many bytes of a number I should save to memory.

At the moment I have an instruction that looks like this: CODE, (addressing info), add-a, add-b, add-c. The opcode and the addressing info are each 4 bits long, and the addresses are 8 bits long. If I add two 32 bit numbers (b and c), the result gets saved at address a. The issue arises when I have a number that is less than 32 bits. For example, if I have an array of 1 byte chars and for whatever reason I want to add 1 to one of the numbers, when I save the 1 byte char back to that array, it would be written as a 32 bit number, thus overwriting the 3 subsequent chars.

I'm not really sure the best way to tackle this issue but I have a few ideas.

Idea 1: Just do everything in 32 bit chunks. Let the programmer deal with the issue themselves (do some funky bitwise manipulation to fit the 1 byte char back into the array, maybe with a mask). I don't want to do this as it would make the code messy.

Idea 2: Only allow addresses at every 32 bits. If every number is 32 bits long, then no number will be overwritten. This sucks because, as far as I can tell, nothing does this. It would make saving smaller numbers take up 4 times more memory than they need to.

Idea 3: Stop working with 32 bit numbers. Only ever add, subtract, store, get 8 bit numbers. This would work and probably be less messy but would also be very annoying. Adding 32 bit numbers would suddenly take at least 4 lines of code, so programs would run slower. It would also mean moving lines of code around would take at least 4 lines of code, as each line of code is 4 bytes long.

Basically I have no idea what I'm doing and I can't find anyone online talking about this. I'm sure either there is something glaringly obvious, or I'm doing something stupid and I will need to redesign the whole system...

Also, side note: I'm not sure if this is the correct place to ask this kind of question, but if it isn't I would love to know where the right place is.

Erik Eidt · Accepted Answer

All the ideas you mention seem to share a common concept, which is to limit what the hardware does and make software make up the rest of its desires/requirements by (a) assembling larger items from smaller storage units, and vice versa, (b) packing smaller items into larger storage units.

Generally speaking, this is how computation works anyway: hardware provides only limited capabilities, and software makes up any shortfall. The limited capabilities, ideally, are well matched to common software patterns, such as strings, integers of various sizes, floats, etc.

Where the line is drawn between hardware-built-in capabilities and software compensation has been changed many times by many processors over the years.

Software generally has to do both of these with any machine organization existing today. If you want an array of Boolean values, then you would probably want to pack them into bytes (or words) and set/extract bits from them, which is (b). On the other hand if you want long strings or multiword numeric data, then software assembles some larger number of storage units into a whole, which is (a).
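As an illustration of case (b), here is a minimal C sketch of packing Boolean values eight to a byte. The helper names `bit_set` and `bit_get` are made up for this example, and numbering the bits from the least significant end of each byte is just one possible convention.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers: store 8 Boolean flags per byte.
   Flag i lives in byte i / 8, at bit position i % 8. */
void bit_set(uint8_t *bits, size_t i, int value)
{
    uint8_t mask = (uint8_t)(1u << (i % 8));
    if (value)
        bits[i / 8] |= mask;            /* turn the flag on */
    else
        bits[i / 8] &= (uint8_t)~mask;  /* turn the flag off, keep the other 7 */
}

int bit_get(const uint8_t *bits, size_t i)
{
    return (bits[i / 8] >> (i % 8)) & 1;
}
```

The hardware only ever sees whole-byte loads and stores here; everything below byte granularity is done by software with shifts and masks.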

Modern 64-bit hardware offers at least 1-byte, 2-byte, 4-byte and 8-byte data (modulo vectors). By offering these data sizes, we mean that it provides for instructions that directly operate on these sizes, i.e. single instructions that do useful things with them.

However, there are no modern bit-addressable machines, so if you want smaller than a byte (quite reasonable sometimes) you have to handle that with software.

Further, if you want 3-, 5-, 6-, or 7-byte data, the hardware doesn't necessarily provide that directly, though support for misaligned loads helps: you can load a larger size and mask off the bad pieces. Stores work similarly, using read-modify-write.
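To make the load-and-mask idea concrete, here is a rough C sketch for a 3-byte value. It assumes little-endian byte order and that the byte just past the value is safe to read and rewrite; `read32`, `write32`, `load_u24` and `store_u24` are hypothetical names, with `read32`/`write32` standing in for a word access at an arbitrary (possibly misaligned) address.

```c
#include <stdint.h>

/* Stand-ins for a 32-bit load/store at any byte address (little-endian). */
uint32_t read32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

void write32(uint8_t *p, uint32_t w)
{
    p[0] = (uint8_t)(w & 0xFFu);
    p[1] = (uint8_t)((w >> 8) & 0xFFu);
    p[2] = (uint8_t)((w >> 16) & 0xFFu);
    p[3] = (uint8_t)((w >> 24) & 0xFFu);
}

/* Load 3 bytes: read a whole word, then mask off the byte we don't own. */
uint32_t load_u24(const uint8_t *p)
{
    return read32(p) & 0x00FFFFFFu;
}

/* Store 3 bytes: read-modify-write so the fourth byte is preserved. */
void store_u24(uint8_t *p, uint32_t value)
{
    uint32_t w = read32(p);
    w = (w & 0xFF000000u) | (value & 0x00FFFFFFu);
    write32(p, w);
}
```

The store costs a load plus a store, which is the price of not having a native 3-byte store.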

If you want 9-byte or larger, you'll have to use multiple load and store instructions, though again misaligned capabilities in the hardware help with odd sizes.

Some instruction sets have drawn the line more narrowly by removing byte load & store instructions (while remaining byte addressable) but providing dedicated instructions that extract the proper byte from a word in a register. This still gives some hardware acceleration for byte operations on hardware that doesn't have misaligned loads. Without byte loads, misaligned capabilities, or special helper instructions, extracting the proper byte from a word can take multiple instructions and/or repeated loading of the same memory word for sequential access.
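In an emulator written in C, such helper instructions might look roughly like the sketch below. The names `extract_byte` and `insert_byte` are invented for illustration, and the little-endian lane numbering (low address bits select the low byte first) is an assumption, not a description of any particular instruction set.

```c
#include <stdint.h>

/* Pick the byte selected by the low 2 address bits out of an
   aligned 32-bit word that is already in a register. */
uint32_t extract_byte(uint32_t word, uint32_t addr)
{
    unsigned shift = (addr & 3u) * 8u;
    return (word >> shift) & 0xFFu;
}

/* Replace that one byte inside the word, leaving the other three alone.
   The caller then stores the whole word back to memory. */
uint32_t insert_byte(uint32_t word, uint32_t addr, uint32_t byte)
{
    unsigned shift = (addr & 3u) * 8u;
    uint32_t mask = (uint32_t)0xFFu << shift;
    return (word & ~mask) | ((byte & 0xFFu) << shift);
}
```

With only word loads and stores in the machine, updating one char becomes: load the aligned word, extract, modify, insert, store the word back.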

I advocate the load/store model. That means a rich set of load & store instructions: load signed byte, load unsigned byte, signed half, unsigned half, word (32), double. Arithmetic in such a model would be register to register, so you don't need smaller than word-sized addition. No mainstream programming language demands byte addition, and having byte arithmetic doesn't even offer optimization in a load/store architecture.
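For an emulator written in C, the load/store model might be sketched as below. This is only an illustration under assumptions: a 256-byte memory to match the 8-bit addresses in the question, little-endian byte order, and made-up names (`mem`, `load_u8`, `load_s8`, `load_32`, `store_8`, `store_32`, `increment_byte`). The point is that loads widen to a full 32-bit register, arithmetic is always 32-bit, and the store decides how many bytes actually get written.

```c
#include <stdint.h>

#define MEM_SIZE 256u      /* hypothetical: matches 8-bit addresses */

uint8_t mem[MEM_SIZE];     /* byte-addressable memory */

/* Loads always produce a full 32-bit register value. */
uint32_t load_u8(uint8_t addr)
{
    return mem[addr];                   /* zero-extend */
}

uint32_t load_s8(uint8_t addr)
{
    uint32_t v = mem[addr];
    return (v ^ 0x80u) - 0x80u;         /* sign-extend bit 7 into bits 8..31 */
}

uint32_t load_32(uint8_t addr)
{
    uint32_t v = 0;
    for (unsigned i = 0; i < 4; i++)    /* little-endian; address wraps at 256 */
        v |= (uint32_t)mem[(uint8_t)(addr + i)] << (8 * i);
    return v;
}

/* Stores write only as many bytes as the instruction names. */
void store_8(uint8_t addr, uint32_t reg)
{
    mem[addr] = (uint8_t)(reg & 0xFFu); /* upper 24 bits are dropped */
}

void store_32(uint8_t addr, uint32_t reg)
{
    for (unsigned i = 0; i < 4; i++)
        mem[(uint8_t)(addr + i)] = (uint8_t)((reg >> (8 * i)) & 0xFFu);
}

/* The char-array case from the question: load, 32-bit add, byte store. */
void increment_byte(uint8_t addr)
{
    uint32_t r = load_u8(addr);         /* widen to a register */
    r = r + 1u;                         /* ordinary 32-bit add */
    store_8(addr, r);                   /* exactly one byte written */
}
```

The neighbouring chars are never touched, because the width lives in the load and store instructions rather than in the add. 16-bit variants would follow the same pattern.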

However, you will want to take the architecture as a whole into account in designing the individual instructions.
