Reputation: 163
In nasm, when I type
bits 32
org 1
jmp mylabel
mylabel:
The org directive offsets all the labels' addresses by 1. However, when I do this in GAS:
.org 1
jmp mylabel
mylabel:
I get a file where the label addresses are the same as if the program didn't have an org, but there is one leading zero byte in the compiled file. Is there a directive in GAS that behaves like the org from nasm?
Upvotes: 3
Views: 2965
Reputation: 39581
The GNU assembler doesn't have a directive that's equivalent to NASM's ORG directive. The GNU assembler's .ORG directive works more like MASM's ORG directive, which is probably what both the NASM and GAS directives were modelled on.
NASM's ORG directive is much more restricted than either the GAS or MASM directives. As Frank Kotler said, it only works with the "bin" output format and can only be used once in a source file. From the NASM Manual:
Unlike the ORG directive provided by MASM-compatible assemblers, which allows you to jump around in the object file and overwrite code you have already generated, NASM's ORG does exactly what the directive says: origin. Its sole function is to specify one offset which is added to all internal address references within the section; it does not permit any of the trickery that MASM's version does.
The GNU assembler's .ORG directive doesn't allow the "trickery" that MASM's ORG directive does. You can't move the origin backwards and overwrite already generated code. GAS does, however, allow you to use it multiple times, and most importantly it works with object file formats like ELF and PECOFF. There's no way to implement the behaviour of NASM's ORG directive with these object file formats, as there's no way to say that a section should be loaded at a specific address.
As dwelch said, the ORG directive, regardless of which assembler you're using, is only meant to be used in single-file assembly projects. NASM enforces this because its ORG only works with the "bin" output format, which can't be linked. With GAS and MASM, the .ORG/ORG directives are only relative to the start of the section/segment in the object file. This means that if you want these directives to set an absolute address in the linked image, the section with the directive must be the first or only section, and the section must begin at address 0.
To get the behaviour you want with the GNU assembler and linker, you need two things. First, you want the generated binary image to work when loaded at the absolute address given by the ORG directive. This means that any absolute memory reference needs to use the address where the referenced location is loaded into memory, not where that location sits in the binary file. These two locations differ because of your second requirement: the binary file must start at the first location in your code, not at address 0.
To show you how you can do this with the GNU assembler and linker, I'm going to use a more realistic example of creating an MS-DOS .COM file. COM files are simple binary files. There are no headers or other information stored in the file as with other executable formats, just the raw binary image. The file is loaded into a single 16-bit segment, starting at offset 0x100. So, just like in your NASM example, the first byte in the file isn't supposed to be loaded at address 0. In this case it's loaded at address 0x100.
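The relationship between the two locations can be sketched with a little shell arithmetic, using the NASM example from the question. The origin of 1 comes from the question; the file offset of 2 for mylabel is an assumption (it presumes the jmp assembles to a 2-byte short jump), and the variable names are purely illustrative:

```shell
# Origin taken from the question's "org 1".
origin=1
# ASSUMPTION: "jmp mylabel" is encoded as a 2-byte short jump,
# so mylabel sits 2 bytes into the binary file.
file_offset=2

# The address a reference to the label must use is the origin plus the
# label's offset within the file, not the file offset alone.
runtime_addr=$(( $origin + $file_offset ))
echo "file offset: $file_offset, address when loaded: $runtime_addr"
```

The file itself still begins with the first instruction; only the addresses resolved for labels are shifted by the origin.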
So here's a simple MS-DOS "Hello, World!" program, written in GNU assembly:
.code16
.text
mov $msg,%dx
mov $9,%ah
int $0x21
mov $0x4c00,%ax
int $0x21
msg:
.ascii "Hello, world!$"
Notice that there's no .ORG directive in the source code example above. It turns out it doesn't help with creating a binary file that isn't loaded at address 0. The file can be assembled normally, but to link it correctly you need to use the -Ttext= option, as mentioned by dwelch:
as -o hello.o hello.s
ld -Ttext=0x100 --oformat binary -o hello.com hello.o
Note that the above commands won't work with the Windows PECOFF versions of the GNU assembler and linker. You'll need to run these commands on Linux or some other machine that uses the ELF object file format.
You can see that the linker generated the COM file correctly with the following commands:
$ hd hello.com
00000000  ba 0c 01 b4 09 cd 21 b8  00 4c cd 21 48 65 6c 6c  |......!..L.!Hell|
00000010  6f 2c 20 77 6f 72 6c 64  21 24                    |o, world!$|
0000001a
$ objdump -b binary -m i8086 --adjust-vma=0x100 -D hello.com
...
00000100 <.data>:
     100:   ba 0c 01                mov    $0x10c,%dx
     103:   b4 09                   mov    $0x9,%ah
     105:   cd 21                   int    $0x21
     107:   b8 00 4c                mov    $0x4c00,%ax
     10a:   cd 21                   int    $0x21
     10c:   48                      dec    %ax
     10d:   65                      gs
...
The first byte in the file is the mov $msg,%dx instruction, as shown by hd. There are no extra bytes padding the start of the COM file. The objdump disassembler output shows that the absolute memory reference to the symbol msg has been correctly resolved. It points to the address where the string will be loaded into memory (0x010c) and not to the location of the string in the file (0x000c).
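As a quick sanity check, the operand can be decoded by hand from the hd output. The sketch below copies the two operand bytes that follow the 0xba opcode (0c 01, stored little-endian) and subtracts the 0x100 load offset; the variable names are illustrative only:

```shell
# Operand bytes of "mov $msg,%dx", copied from the hd output (low byte first).
lo=0x0c
hi=0x01
# COM files are loaded at offset 0x100 within their segment.
load_base=0x100

# Reassemble the 16-bit little-endian operand.
operand=$(( ($hi << 8) | $lo ))
operand_hex=$(printf '0x%x' "$operand")

# Subtracting the load offset gives the string's position in the file.
file_offset=$(( $operand - $load_base ))

echo "operand: $operand_hex, file offset of msg: $file_offset"
```

The resulting file offset of 12 (0xc) matches the position of the first byte of the string (48, 'H') in the hex dump.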
For a more complicated example that links multiple files together or uses multiple sections, you'll probably need to use a linker script rather than the -Ttext= option.
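As a rough idea of what such a linker script might look like for the COM example, here is a minimal sketch. The file name com.ld and the choice of sections are assumptions for illustration, not something taken from a real project; a larger program may need more sections or a different layout:

```shell
# Write a minimal, hypothetical linker script for a flat binary
# whose first byte is loaded at address 0x100.
cat > com.ld <<'EOF'
OUTPUT_FORMAT(binary)
SECTIONS
{
    . = 0x100;               /* addresses start at the COM load offset */
    .text : { *(.text) }     /* code from all input files */
    .data : { *(.data) }     /* initialised data follows the code */
    .bss  : { *(.bss) }      /* uninitialised data goes last */
}
EOF
```

The script would then be passed to the linker with the -T option in place of -Ttext= and --oformat binary, for example ld -T com.ld -o hello.com hello.o.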
Upvotes: 4