Reputation: 633
I am studying computer architecture (MIPS architecture) and read the following statements:
1. Branch instructions have a 16-bit signed word offset field that allows a branch to an address + or - 128 kBytes (+0x1FFFC to -0x20000) from the current location.
2. A jump instruction specifies an address within the current 256 MByte (0x0FFFFFFC) region specified by the program counter's most significant 4 bits.
I understand the concept of the jump range described above, but how are the three numbers 0x0FFFFFFC, 0x1FFFC and 0x20000 calculated from "the range of 256 MByte" and "the range of +-128 kBytes"?
Thanks!
Upvotes: 1
Views: 3407
Reputation: 31
The other answers didn't really answer your question of how these hex values are calculated, so here's my answer.
Thinking about this is much easier in binary than in hex, because the 2-bit left shift is the key to the concept: shifting left by 2 bits multiplies by 4, which doesn't map neatly onto hex, since each hex digit covers 16 values (4 bits). I'll try to explain it anyway:
1 Branch instructions use a 16-bit immediate field. (16-bit immediate + 5-bit RS + 5-bit RT + 6-bit opcode == 32 bits) (https://en.wikibooks.org/wiki/MIPS_Assembly/Instruction_Formats#I_Format)
Those 16 bits are signed: they can be positive or negative.
That gives you an effective range of -(2^15) == -32768
to +(2^15 -1) == 32767
MIPS multiplies the branch and jump address fields by 4 (a 2-bit left shift), which forces targets to be word aligned.
So your minimum value is -(2^15).
Multiply by 4: -(2^15 * 4) == -(2^15 * 2^2) == -(2^(15+2))
which becomes -(2^17) == -131072
In binary (signed 2's complement):
1000 0000 0000 0000 <<2 == 10 0000 0000 0000 00[00]
Converting that to hex (10 = 2, each 0000 = 0) gives 2 0 0 0 0 ==
0x20000 (the magnitude; as a signed offset it is -0x20000)
This is sign-extended before being added to (PC+4).
For example, for instruction #32770 (counting from 0x00400000), PC=0x00420008 and (PC+4)=0x0042000C:
0x0042000C - 0x20000 = 0x0040000C, instruction #3
(remember, the offset is relative to PC+4)
#32770 + 1 - 32768 == 3
The same goes for the maximum value:
(2^15 - 1)
Multiply by 4: (2^15 - 1) * 4 == 2^15 * 2^2 - 1 * 4 == 2^(15+2) - 4
which becomes (2^17 - 4) == 131068
0111 1111 1111 1111 <<2 == 01 1111 1111 1111 11[00]
Converting that to hex (01 = 1, each 1111 = F, 1100 = C) gives 1 F F F C ==
0x1FFFC
Note that the offset needs to be added to the current (program counter + 4).
For example, for instruction #32770 (counting from 0x00400000), PC=0x00420008 and (PC+4)=0x0042000C:
0x0042000C + 0x1FFFC = 0x00440008, instruction #65538
(remember, the offset is relative to PC+4)
#32770 + 1 + 32767 == 65538
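The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation, not of any real tool; the names `IMM_BITS`, `WORD_SHIFT` and `branch_target` are made up for illustration.

```python
# Sketch: derive the MIPS branch byte-offset range from the 16-bit signed word offset.
IMM_BITS = 16      # width of the branch immediate field
WORD_SHIFT = 2     # the offset counts 4-byte words, so it is shifted left by 2

min_words = -(1 << (IMM_BITS - 1))       # -32768
max_words = (1 << (IMM_BITS - 1)) - 1    # 32767

min_bytes = min_words << WORD_SHIFT      # -131072 == -0x20000
max_bytes = max_words << WORD_SHIFT      # 131068 == 0x1FFFC


def branch_target(pc: int, offset_words: int) -> int:
    """Target = (PC + 4) + (signed word offset * 4), wrapped to 32 bits."""
    return (pc + 4 + (offset_words << WORD_SHIFT)) & 0xFFFFFFFF


print(hex(-min_bytes), hex(max_bytes))
print(hex(branch_target(0x00420008, min_words)))
print(hex(branch_target(0x00420008, max_words)))
```

With PC=0x00420008 this reproduces the two targets worked out by hand, 0x0040000C and 0x00440008.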
2 Jumps use a 26-bit address field.
Also note that jumps use an absolute address (within the current 256 MByte region), not an offset.
The maximum 26-bit value is (2^26 - 1) == 67108863, 0x03FFFFFF
Shifted left by 2 (*4) it becomes 28 bits: {(2^26 - 1) * 4} == {2^28 - 4} ==
268435452, 0x0FFFFFFC
But what about the missing four bits? They come from the PC, which by the Memory stage has already been incremented to (PC+4).
For instruction #32770, PC=0x00420008 and (PC+4)=0x0042000C.
0x0042000C in binary is [0000] 0000 0100 0010 0000 0000 0000 1100
+0x0FFFFFFC in binary [####] 1111 1111 1111 1111 1111 1111 1100
The target has only 28 bits (27:0) and is missing bits 31:28.
Taking those bits from PC+4, we get:
0000 ---- ---- ---- ---- ---- ---- ---- (PC+4)
---- 1111 1111 1111 1111 1111 1111 1100 (Target-Address)
-----------------------------------------
0000 1111 1111 1111 1111 1111 1111 1100 (Jump-Address)
(which in this case is the same value as zero-extending it, because the top four bits of PC+4 are 0000)
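The bit merge shown above can be written as a small Python function. This is an illustrative sketch only; `jump_target` and its parameter names are assumptions, not part of any MIPS toolchain.

```python
def jump_target(pc: int, field26: int) -> int:
    """Combine bits 31:28 of PC+4 with the 26-bit jump field shifted left by 2."""
    region = (pc + 4) & 0xF0000000         # bits 31:28 come from PC+4
    low28 = (field26 & 0x03FFFFFF) << 2    # bits 27:0, always word aligned
    return region | low28


# Largest field value, jumping from the example instruction at 0x00420008.
print(hex(jump_target(0x00420008, 0x03FFFFFF)))
# Same field from a PC in a different 256 MByte region: only the top nibble changes.
print(hex(jump_target(0x80420008, 0x03FFFFFF)))
```

The second call shows why the reachable range is "the current 256 MByte region": the field can never change bits 31:28.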
For a better explanation of how addresses are calculated, see the question "How to Calculate Jump Target Address and Branch Target Address?"
Upvotes: 3
Reputation: 71516
Why don't you just ask a tested and debugged toolchain, then compare that to the documentation?
so.s
four:
nop
nop
nop
j one
nop
j two
nop
j three
nop
j four
nop
nop
nop
nop
nop
one:
nop
two:
nop
nop
three:
nop
build and disassemble
mips-elf-as so.s -o so.o
mips-elf-objdump -D so.o
so.o: file format elf32-bigmips
Disassembly of section .text:
00000000 <four>:
...
8: 0800000f j 3c <one>
c: 00000000 nop
10: 08000010 j 40 <two>
14: 00000000 nop
18: 08000012 j 48 <three>
1c: 00000000 nop
20: 08000000 j 0 <four>
24: 00000000 nop
...
0000003c <one>:
3c: 00000000 nop
00000040 <two>:
...
00000048 <three>:
48: 00000000 nop
link to some address and disassemble
00001000 <_ftext>:
...
1008: 0800040f j 103c <one>
100c: 00000000 nop
1010: 08000410 j 1040 <two>
1014: 00000000 nop
1018: 08000412 j 1048 <three>
101c: 00000000 nop
1020: 08000400 j 1000 <_ftext>
1024: 00000000 nop
...
0000103c <one>:
103c: 00000000 nop
00001040 <two>:
...
00001048 <three>:
1048: 00000000 nop
So jumps are super easy. What about branches?
four:
nop
nop
nop
beq $10,$11,one
nop
beq $10,$11,four
nop
nop
nop
one:
nop
assemble and disassemble
00000000 <four>:
...
8: 114b0006 beq $10,$11,24 <one>
c: 00000000 nop
10: 114bfffb beq $10,$11,0 <four>
14: 00000000 nop
...
00000024 <one>:
24: 00000000 nop
Some experience helps here. First, going forward: 0x24 - 0x8 = 0x1C. These are fixed 32-bit instructions, so it is unlikely they would waste the two low bits and cut the range, so 0x1C>>2 = 7. The encoding has a 6. Well, it is also likely they are thinking in terms of the pc having already been incremented; another way to look at this is 6(+1) instructions ahead: 0xC, 0x10, 0x14, 0x18, 0x1C, 0x20, 0x24. That would imply going backward is (0x00 - (0x10+4))>>2 = (0x00-0x14)>>2 = 0xFFFF...FFFFEC>>2 = 0xFF...FFFB, and sure enough that is what we get.
So for branches you take
((destination - (current address + 4))/4)&0xFFFF =
(((destination - current address)/4) - 1)&0xFFFF
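As a quick check of the branch formula against the disassembly above, here is a minimal Python sketch. The function name `branch_imm` is hypothetical, and the constant 0x114B0000 is simply the upper half of the `beq $10,$11` words seen in the objdump output.

```python
def branch_imm(current: int, destination: int) -> int:
    """16-bit branch immediate: word offset measured from current address + 4."""
    delta = destination - (current + 4)
    if delta % 4 != 0:
        raise ValueError("branch target must be word aligned")
    offset_words = delta // 4
    if not -(1 << 15) <= offset_words <= (1 << 15) - 1:
        raise ValueError("branch target out of range")
    return offset_words & 0xFFFF


BEQ_10_11 = 0x114B0000  # opcode/rs/rt bits of `beq $10,$11,...` from the listing

print(hex(BEQ_10_11 | branch_imm(0x08, 0x24)))  # forward branch to <one>
print(hex(BEQ_10_11 | branch_imm(0x10, 0x00)))  # backward branch to <four>
```

The two words match the assembler's output, 114b0006 and 114bfffb.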
For jumps, immediate = destination[27:2]; the processor then forms the target as {pc[31:28], immediate, 00}
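The jump encoding can be checked against the linked disassembly in the same way. This is a sketch under the assumption that the region check uses the address of the instruction after the jump; `encode_j` and `J_OPCODE` are names invented for this example.

```python
J_OPCODE = 0x02  # 6-bit opcode of `j`, so the word starts with 0x08000000


def encode_j(current: int, destination: int) -> int:
    """Build a `j` instruction word; the field holds destination bits 27:2."""
    if destination & 0x3:
        raise ValueError("jump target must be word aligned")
    if ((current + 4) ^ destination) & 0xF0000000:
        raise ValueError("target is outside the current 256 MByte region")
    imm26 = (destination >> 2) & 0x03FFFFFF
    return (J_OPCODE << 26) | imm26


print(hex(encode_j(0x1008, 0x103C)))  # j one
print(hex(encode_j(0x1020, 0x1000)))  # j _ftext
```

Both words agree with the objdump listing, 0800040f and 08000400.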
You should be able to figure out the ranges from that information.
The key to the encoding is that the instructions are fixed at 32 bits and aligned on 32-bit boundaries, so the two lsbits are always zeros, along with the math associated with them. So why cut your range down by a factor of 4 to store zeros? You don't; you efficiently pack the offsets into the immediate. Some (fixed-length) instruction sets don't do that, but they generally have a reason not to as part of the design.
In general, a debugged assembler, if you have access to one, is going to provide more useful information than an instruction set reference; this is based on experience learning many instruction sets. If you are the first one to write an assembler for some processor, then you work there or have direct access to the designers of the processor, and you can simply ask them the math rather than rely on the not-yet-written manual, which they will write after the chip has taped out, which is too late, as you/they need the assembler to validate the design. So: emails, Skype calls, and most importantly whiteboard discussions of the instruction encoding. You might also have access to the chip source code and/or a simulator, so you can run your code, see it execute in the sim (examine the waveforms), see where it branches to (where it fetches), change the immediate, and look at where it fetches.
Basically you should in general always have access to a resource with the answer that can help explain a manual lacking some detail. Granted sometimes you get a good manual...(and you should still verify that with the resource).
Upvotes: 1