Reputation: 141
I have programmed an assembler with a preprocessor for the MOS 6502 microprocessor. The assembler spits out the correct binary and the preprocessor performs constant substitution, inclusions and conditional inclusions. The problem is retaining file positions of the included files. At this point the preprocessor emits a file directive just before and after a file is included. Here is an example.
Proggie.asm
JSR init
JSR loop
JSR end
%include "Init.asm"
%include "Loop.asm"
%include "End.asm"
Init.asm
init:
LDX #$00
RTS
Loop.asm
loop:
INX
CPX #$05
BNE loop
RTS
End.asm
end:
BRK
Preprocessor result
%file "D:\Proggie.asm" 1
JSR init
JSR loop
JSR end
%file "D:\Init.asm" 1
init:
LDX #$00
RTS%file "D:\Init.asm" 2
%file "D:\Loop.asm" 1
loop:
INX
CPX #$05
BNE loop
RTS%file "D:\Loop.asm" 2
%file "D:\End.asm" 1
end:
BRK%file "D:\End.asm" 2
%file "D:\Proggie.asm" 2
This idea comes from the output that GCC's preprocessor produces. The %file directive tells the lexical analyzer that a file has just been entered or exited. The number after the file path says which: 1 means the analyzer is entering the given file and 2 means it is leaving it. My lexical analyzer kind of works with this, but it is still a bit off when reporting the current line number.
So my question is: Is this the way to go? Or is there another algorithm I could use?
Upvotes: 0
Views: 85
Reputation: 241731
Gcc's preprocessor fabricates line control directives which look like this:
# 122 "/usr/include/x86_64-linux-gnu/bits/types.h" 2 3 4
Here, the 122 is the line number in the file /usr/include/x86_64-linux-gnu/bits/types.h. Including the line number means that a downstream lexer doesn't need to track the include stack in order to tell which line it is on.
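The rest of the line is a set of flags, which are similar to your approach with the addition of a couple of gcc-specific flags:
1: start of a new file
2: returning to a file (after an included file has ended)
3: the following text comes from a system header, so certain warnings should be suppressed
4: the following text should be treated as wrapped in an implicit extern "C" block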
These allow the downstream lexer to track the include stack if it wishes, and the gcc lexer does so in order to produce more informative (or at least more wordy) error messages.
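As a rough illustration of the lexer side, here is a minimal Python sketch that reads gcc-style markers. The current file and line come straight from the most recent marker, and the include stack is only maintained to build an "included from" chain. The function name include_chains and the regular expression are my own assumptions, not gcc's code. The sketch also ignores escaped characters inside the quoted file name and assumes the line counter has reached the line of the include directive when a flag-1 marker arrives.

```python
import re

# Matches markers such as:  # 122 "/usr/include/foo.h" 2 3 4
MARKER = re.compile(r'^#\s+(\d+)\s+"([^"]*)"((?:\s+\d+)*)\s*$')

def include_chains(preprocessed_text):
    """Yield (file, line, text, chain) for every non-marker line.

    chain is a tuple of (file, line) include sites, outermost first.
    """
    stack = []                      # pending include sites
    cur_file, cur_line = None, 1
    for raw in preprocessed_text.splitlines():
        m = MARKER.match(raw)
        if m is None:
            yield cur_file, cur_line, raw, tuple(stack)
            cur_line += 1
            continue
        flags = m.group(3).split()
        if "1" in flags and cur_file is not None:
            # Entering a new file: remember where we were included from.
            stack.append((cur_file, cur_line))
        elif "2" in flags and stack:
            # Returning to the including file.
            stack.pop()
        # The marker states where the NEXT line comes from, so the
        # position never has to be reconstructed from the stack.
        cur_line, cur_file = int(m.group(1)), m.group(2)
```

A lexer that only needs positions can drop the stack entirely and keep just the last assignment.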
I think the logic is easier with the preprocessor maintaining the stack, but it doesn't make a huge amount of difference, particularly if you're also going to want to generate "included from" notes in your error messages.
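For the preprocessor side, a sketch along the following lines might work. It assumes a hypothetical extension of your directive to %file "<name>" <line> <flag>, where the line number is the number of the next source line, flag 1 means entering a file and flag 2 means returning to the including file. The names preprocess and sources are made up for the example, and sources is a plain dict standing in for the file system.

```python
import re

INCLUDE = re.compile(r'^\s*%include\s+"([^"]+)"\s*$')

def preprocess(name, sources, out=None, stack=None):
    """Expand %include lines and emit '%file "<name>" <line> <flag>' markers.

    sources maps file names to their text (a stand-in for real file I/O).
    Returns the output as a list of lines.
    """
    out = [] if out is None else out
    stack = [] if stack is None else stack
    if name in stack:
        raise ValueError(f"recursive include of {name}")
    stack.append(name)
    out.append(f'%file "{name}" 1 1')            # entering: next line is 1
    for number, line in enumerate(sources[name].splitlines(), start=1):
        m = INCLUDE.match(line)
        if m is None:
            out.append(line)
            continue
        preprocess(m.group(1), sources, out, stack)
        # Back in the including file: the next line is number + 1.
        out.append(f'%file "{name}" {number + 1} 2')
    stack.pop()
    return out
```

Because the output is built line by line, a marker always starts on its own line even when an included file has no trailing newline, which may be what glues RTS and %file together in your result.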
Upvotes: 1