KSquared

Reputation: 58

Pointer to .text in real mode assembly

There must be something fundamental I'm misunderstanding about real mode addressing. I am trying to set up a function to print text via a BIOS interrupt in real mode. I am testing the code using a .com file executed under DOSBox. The text I want to print ends up at 0x1000 (0x0F00 in the .com file). So let's say I want to print the first letter of that text.

xor ebx, ebx
mov ecx, 1
mov ah, 10
mov al, ds:[0x1000]
int 0x10

That works, and prints out 'H', because I have no imagination. But then I don't want it to just print out the same letter. I want to pass in a pointer, and I want to increment that pointer as I'm printing out more text. At this stage, I'm happy enough just reading offset from the register. So I make the following change.

mov edx, 0x1000
mov al, ds:[edx]

And no character gets printed. I've tried using the esi and edi registers, with the same result. Using lea edx, byte ptr [0x1000] produces the same result. Worse, trying to use the 16-bit equivalents (dx, si, di) results in the program hanging. I've tried looking through the machine code in the .com file, and I can't find anything obviously wrong.

I am compiling the code with gcc using a custom linker script and an objcopy call to make a .com file. No libraries are linked, and target architecture is 386.

Any help would be much appreciated.

Edit: Full listing.

directio.s

.intel_syntax
.global _printChar

_printChar:
    push ebp;
    mov ebp, esp;

    xor edx, edx;
    xor ebx, ebx;
    xor eax, eax;
    mov ecx, 1;

    mov ah, 10;
    mov edx, 0x1000;
    mov al, ds:[edx];
    int 0x10;

    mov esp, ebp
    pop ebp;
    ret;

dirTest.c

asm
(
    ".code16gcc;\n" \
    "call _dosmain;\n" \
    "mov ah, 0x4C;\n" \
    "int 0x21;\n"
);

#include "directio.h"

int dosmain(void)
{
    printChar("Hello World!");
    return 0;
}

com_mingw.ld

SECTIONS
{
    . = 0x0100;
    .text :
    {
        *(.text);
    }
    .data :
    {
        *(.data);
        *(.bss);
        *(.rodata);
    }
    _heap = ALIGN(4);
}

All of this compiles with the following command line.

gcc -std=gnu99 -Os -nostdlib -m32 -masm=intel -march=i386 -ffreestanding -o dirTest.com -Wl,--nmagic,--script=com_mingw.ld dirTest.c directio.s

Followed by

objcopy dirTest.com -O binary

Upvotes: 1

Views: 320

Answers (2)

Keith Marshall

Reputation: 2034

It must be about three decades since I last wrote any real mode code, in .com image format, but here are a few observations:

  • It is unusual to define multiple sections when writing .com format code. The format permits only one segment, with a maximum initial image size of 64 KiB; conventionally, this may comprise logical sections, often called "CODE" and "DATA", or "TEXT" and "DATA", but they must be grouped into a single physical segment; any explicit segment of the "STACK" class is prohibited.
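  • MS-DOS is a 16-bit operating system, so your code runs with 16-bit default operand and address sizes. You should normally write it using the 16-bit registers (or their 8-bit sub-registers); on a 386, the 32-bit registers are reachable only through size-override prefixes, which the assembler emits correctly only if it knows it is generating 16-bit code.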
  • The 16-bit processors did not allow you to use any registers other than BX, SI, DI, and BP as base or index registers for indirect memory addressing; you cannot use DX (as you do), CX, or AX for this purpose. SP cannot be used in a 16-bit addressing mode either; leave it alone and reserve it for its intended use as the stack pointer.
  • When the operating system loads a .com format process image, it first allocates an environment block, then the program segment following it. The first 256 bytes of the program segment are then filled in with administrative data (creating what is commonly referred to as the Program Segment Prefix, or PSP), and the .com image is loaded immediately thereafter, starting at offset 0x100. The 0x1000 to which you allude is just an arbitrary offset within the program segment, at which (presumably) your data is defined; there is nothing sacrosanct about it, and program code could just as easily appear there.
  • Once the image has been loaded, the operating system sets all four segment registers to the start of the program segment, and sets CS:IP (by performing a FAR JMP, or possibly a FAR CALL), to begin execution of your process at the appropriate entry address, which is always at offset 0x100 (not related in any way to your 0x1000 data address) within the program segment.

Upvotes: -1

rkhb

Reputation: 14409

_printChar should be a 16-bit function, so don't assemble it as 32-bit. Add a .code16gcc to the top of the .s-file and change 32-bit registers to 16-bit:

.code16gcc
.intel_syntax
.global _printChar

_printChar:
    push bp;
    mov bp, sp

    xor dx, dx
    xor bx, bx
    xor ax, ax
    mov cx, 1

    mov ah, 10
    mov si, 0x1000
    mov al, ds:[si]     # SI, not DX: 16-bit addressing accepts only BX, BP, SI, DI
    int 0x10

    mov sp, bp
    pop bp
    ret

Now, it should (hopefully) work.

Upvotes: 1
