Mike A.

Reputation: 175

DOSBox Debug Assembly

I am running DOSBox using the debug utility to build assembly code. I am just trying to figure out how to read in a string. I have this so far.

-n test.com
-a
072A:0100 db 15
072A:0101 db 16
072A:0102 mov dx, 100
072A:0105 mov ah, 0A
072A:0107 int 21
072A:0109 int 20

-rcx 15
-w
-q

So I can read a string of 15 or fewer characters into the buffer. When I read in the string, where does it get placed? I want to try to print the string back out.

Upvotes: 0

Views: 4198

Answers (1)

byteptr

Reputation: 1385

The code in your sample has 2 problems.

You don't want to place your raw data at offset 100, because DOS begins executing a .COM program at that address. The CPU therefore treats your two data bytes as code: 15 16 is decoded as the start of an "ADC AX,BA16" instruction, which steals the BA opcode byte from your "mov dx,100". As a result, DX is never loaded with the address of your buffer before INT 21 is called.

Even if you didn't have problem #1, the second problem is that the input characters would overwrite your code starting at offset 102, since the first two bytes of the buffer at 100 are reserved (maximum length and returned character count). If you could get the code to execute, this is probably not what you want. ;)
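
To make problem #1 concrete, here is how the bytes of the original program line up when execution starts at offset 100. This table is decoded by hand from the opcode bytes, not captured from a DEBUG session, so treat it as a sketch; unassembling your file with -u 100 should show something similar.

```
offset  bytes     what was typed          what the CPU decodes
0100    15 16 BA  db 15, db 16, (BA ...)  ADC AX,BA16
0103    00 01     (... rest of mov dx)    ADD [BX+DI],AL
0105    B4 0A     mov ah,0A               MOV AH,0A
0107    CD 21     int 21                  INT 21
0109    CD 20     int 20                  INT 20
```

Decoding falls back in step at 0105, so the INT 21 call is still made, but with whatever value happened to be in DX rather than the address of a valid buffer.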

You don't want to define any data at offset 100, as this is the entry point of a .COM program. A simple way to handle this is to define your data after the code. With a tool like debug, where you have to specify memory locations directly instead of using labels, you have to take the size of each assembled instruction into account to know where the data will end up.

One method is to just assemble your code starting at address 100 and use placeholders for your string and buffer addresses; then fix those instructions once you've defined the data.

Here is your revised sample:

NOTE: I couldn't help myself and added an output prompt prior to the string input function

-n test2.com
-a
0100: mov dx,100    ;100 is a placeholder for the address of output prompt
0103: mov ah,9
0105: int 21        ;output prompt
0107: mov dx,100    ;100 is a placeholder for the address of the input buffer
010A: mov ah,0a
010C: int 21        ;read user input
010E: int 20
0110: db 48         ;"H"
0111: db 65         ;"e"
0112: db 6c         ;"l"
0113: db 6c         ;"l"
0114: db 6f         ;"o"
0115: db 3a         ;":"
0116: db 20         ;<space>
0117: db 24         ;"$" / DOS strings terminated by dollar sign
0118: db 0f         ;buffer member: max 15 chars, including the terminating CR
0119: db 00         ;buffer member: character input count stored here

;the remainder of the .COM memory segment can be used to store the data
; starting at address 11A

;now that we know where our data fits, let's plug the addresses in

-a 100
0100: mov dx,110    ;address of "H" in prompt string
-a 107
0107: mov dx,118    ;address of input buffer; its first two bytes are reserved (see above)
                    ;for the max length to read and the count of characters read

-rcx
CX 0000
:1a
-w
Writing 001A bytes

;let's run the program

-g
Hello: <let's assume you type "cat" followed by RETURN>
Program terminated normally

;now look at the 2nd byte in the buffer for the number of characters typed
;   along with the character data entered

-d 118 11e
0110          0F 03 63 61 74 0D 0D      ..cat..

I don't think you have to worry about reserving space for your buffer once you've defined the max input chars member. Since the buffer is at the end of the program, you have access to the remainder of the 64k segment as your buffer space (that is, 64k minus the 0x11A bytes taken up by the 0x100-byte PSP and the 0x1A bytes of code and data).

EDIT 2016/09/11

Here is a revised version (with explanation) to address your questions about outputting what was typed by the user:

0100  MOV     DX,0140
0103  MOV     AH,09
0105  INT     21            ;output prompt
0107  MOV     DX,014B
010A  MOV     AH,0A
010C  INT     21            ;read user input
010E  MOV     DX,0148
0111  MOV     AH,09
0113  INT     21            ;output CRLF
0115  MOV     DI,[014C]     ;load DI with the word at 014C (count byte + first char)
0119  AND     DI,00FF       ;   keep only the low byte: the number of characters read
011D  LEA     DI,[DI+014D]  ;point DI just past the last input character (at the CR)
0121  MOV     AL,24         ;load DOS terminator char "$" in AL
0123  STOSB                 ;write it to end input string
0124  MOV     DX,014D
0127  MOV     AH,09
0129  INT     21            ;print the input string
012B  INT     20            ;exit program

;I inserted NOPS until address 0140 so I would have room to insert code without my
;   data offsets changing

0140 DB 48                  ;"Hello: $" prompt begin
0141 DB 65
0142 DB 6C
0143 DB 6C
0144 DB 6F
0145 DB 3A
0146 DB 20
0147 DB 24
0148 DB 0D                  ;CRLF$ sequence begin
0149 DB 0A
014A DB 24
014B DB 20                  ;Buffer begin (32 bytes max length)
014C DB 00
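
The MOV/AND/LEA/STOSB sequence at 0115 can probably be shortened by loading only the count byte into BL and storing the "$" through BX. The following is an untested sketch with addresses worked out by hand; it assumes the same data layout as above (count at 014C, text at 014D) and that it replaces everything from 0115 onward.

```
0115  XOR     BX,BX                   ;clear BX so BH is zero
0117  MOV     BL,[014C]               ;BL = number of characters read
011B  MOV     BYTE PTR [BX+014D],24   ;overwrite the trailing CR with "$"
0120  MOV     DX,014D
0123  MOV     AH,09
0125  INT     21                      ;print the input string
0127  INT     20                      ;exit program
```

Reading a single byte avoids the need to mask off the high byte afterwards, and the shorter code still ends well before the data at 0140.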

There are multiple ways to achieve the same result, such as loading the CX register with the user input count and using a LOOP instruction to print each character using the appropriate DOS function. Good luck!
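
As a rough sketch of the LOOP approach, the code from 0115 onward could look like the following. It is untested and makes a few assumptions: the data layout is the same as above (count at 014C, text at 014D), the addresses were worked out by hand from the instruction sizes, and DOS function 02 (write the character in DL) is used as the per-character output function.

```
0115  MOV     CL,[014C]     ;CL = number of characters read
0119  XOR     CH,CH         ;clear CH so CX holds the count
011B  JCXZ    012A          ;nothing typed: skip the loop
011D  MOV     SI,014D       ;SI -> first input character
0120  CLD                   ;make LODSB move forward
0121  LODSB                 ;AL = next character, SI advances
0122  MOV     DL,AL
0124  MOV     AH,02         ;DOS function 02: write character in DL
0126  INT     21
0128  LOOP    0121          ;decrement CX, repeat until it reaches zero
012A  INT     20            ;exit program
```

With this variant no "$" terminator has to be written into the buffer, since the loop stops after the counted number of characters. The JCXZ guard matters because LOOP decrements CX before testing it, so entering the loop with CX = 0 would run it 65536 times.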

Upvotes: 4
