Reputation: 1107
I'm looking for the cleanest possible way to implement the #include directive in a C compiler.
I only know how to implement the outer part of the processing: detect the '#' character at the beginning of a line to enter a preprocessor-only loop, then read the "include" keyword and the string between <> or "".
What I don't know is the best way to implement the internal processing that carries out the actual effect of the #include directive: expanding the full path for library header files (those using <>) but not for the ones using "" (it's probably cleaner and more flexible to assume that those are in the current directory, as that would also allow including source files given with a full path).
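For reference, here is a minimal sketch of one common lookup convention, not something the C standard mandates (the search is implementation-defined): the quoted form first tries the directory of the including file and then falls back to the search directories, while the angle-bracket form uses the search directories only. The names `resolve_include`, `readable`, and `search_dirs` are made up for illustration, and the sketch assumes '/' as the path separator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `path` can be opened for reading, 0 otherwise. */
static int readable(const char *path)
{
    FILE *f = fopen(path, "r");
    if (!f) return 0;
    fclose(f);
    return 1;
}

/* Hypothetical helper: resolve the header `name` into `out`.
   quoted != 0  -> #include "name": try the includer's directory first.
   In both cases, then try each entry of the NULL-terminated `search_dirs`.
   Returns 0 if a readable file was found, -1 otherwise. */
int resolve_include(const char *name, int quoted, const char *includer,
                    const char *const *search_dirs, char *out, size_t cap)
{
    assert(name && includer && search_dirs && out);

    if (quoted) {
        /* Keep everything up to and including the last '/' of the includer. */
        const char *slash = strrchr(includer, '/');
        int dirlen = slash ? (int)(slash - includer) + 1 : 0;
        int n = snprintf(out, cap, "%.*s%s", dirlen, includer, name);
        if (n >= 0 && (size_t)n < cap && readable(out)) return 0;
    }
    for (; *search_dirs; search_dirs++) {
        int n = snprintf(out, cap, "%s/%s", *search_dirs, name);
        if (n >= 0 && (size_t)n < cap && readable(out)) return 0;
    }
    return -1;   /* not found anywhere */
}
```

With this shape, a call such as `resolve_include("util.h", 1, "src/main.c", dirs, buf, sizeof buf)` would try `src/util.h` before any directory in `dirs`.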
The tasks I think I would need to implement would be:
The main C file passed as a command-line parameter to the compiler should be processed just like a #include "mainfile.c" directive, to start the compilation in a uniform way.
Expand the path for files included with quotes ("", are single quotes '' valid at least for some compilers?)
Put the file in a list of files, also recording the line and the file in which the #include directive was found.
In the preprocessor stage, check whether a line is an #include directive and try to open the specified file unconditionally, so that all files are known from the start. If a file doesn't exist, don't signal an error at the preprocessor stage; report it only later, while translating the actual C code, once the #ifdef or #elif conditions have determined whether the file should actually be included and it has been marked as usable.
After processing all the #include directives in the code, process the rest of the preprocessor code, now with the full set of potential files to include.
I think that a stack of files would be useful, but only after completing the preprocessor stage, when we are already translating the whole code and adding files (pushing file indexes onto the source file stack at each #include and popping them at the end of each source file).
I think that the easiest way to handle the code would be to inspect all of the files named by #include directives, make a list of them, and later mark as usable only the ones that I will actually include and process fully, that is, those that meet the #ifdef or #elif conditions. For that, though, I need to see which included files there are in the whole set of source files.
Upvotes: 3
Views: 300
Reputation: 1107
It seems that the preprocessor code needs to be processed in order, so that we know whether we have already defined something or included a file and can avoid doing so again. The directives therefore have to be handled in the order in which they appear.
The actual C code can probably be analyzed in any order by making several passes, mostly over the global declarations, so that things can be used before they are declared, but the preprocessor has to be processed in order to be able to define and include things selectively.
Upvotes: 0
Reputation: 126253
You usually process all preprocessor directives as you read them. So when you see an #include, you get the file name, search through the include path, open the file and start processing it; there is no need to defer things. Once you get to the end of the included file, you continue processing the original file.
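As a rough illustration of that flow, here is a minimal sketch in which the call stack plays the role of the file stack: each #include is expanded by a recursive call, and returning from the call resumes the including file. The names `expand_file`, `parse_include`, and `MAX_INCLUDE_DEPTH` are invented for this example. The sketch assumes lines shorter than 1024 characters, opens the name as written (no include-path search), and ignores everything else a real preprocessor does, such as macros, comments, and line splicing.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_INCLUDE_DEPTH 64   /* arbitrary guard against runaway nesting */

/* If `line` is an #include directive, copy the file name into `name` and
   return the opening delimiter ('"' or '<'); otherwise return 0. */
static int parse_include(const char *line, char *name, size_t cap)
{
    while (*line == ' ' || *line == '\t') line++;
    if (*line++ != '#') return 0;
    while (*line == ' ' || *line == '\t') line++;
    if (strncmp(line, "include", 7) != 0) return 0;
    line += 7;
    while (*line == ' ' || *line == '\t') line++;

    char open = *line;
    char close = (open == '"') ? '"' : (open == '<') ? '>' : '\0';
    if (close == '\0') return 0;
    line++;

    const char *end = strchr(line, close);
    if (!end || (size_t)(end - line) >= cap) return 0;
    memcpy(name, line, (size_t)(end - line));
    name[end - line] = '\0';
    return open;
}

/* Copy `path` to `out`, expanding every #include at the point where it is
   read. Returns 0 on success, -1 if a file cannot be opened or the nesting
   is too deep. */
int expand_file(const char *path, FILE *out, int depth)
{
    char line[1024], name[256];
    assert(path && out);

    if (depth > MAX_INCLUDE_DEPTH) return -1;
    FILE *in = fopen(path, "r");
    if (!in) {
        fprintf(stderr, "cannot open %s\n", path);
        return -1;
    }

    int rc = 0;
    while (rc == 0 && fgets(line, sizeof line, in)) {
        if (parse_include(line, name, sizeof name))
            rc = expand_file(name, out, depth + 1);  /* process it right now */
        else
            fputs(line, out);                        /* ordinary line */
    }
    fclose(in);   /* end of this file: the caller continues where it stopped */
    return rc;
}
```

The main source file is handled the same way as any header: calling `expand_file("main.c", stdout, 0)` would start the whole process.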
Similarly with an #if, you read the condition and decide if it is true or false. If false, you then start skipping over the input, ignoring it until you find the matching #else or #endif. So if there's an #include in there, you just skip it.
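A minimal sketch of the skipping logic follows. To stay short it handles only #ifdef, #else, and #endif (evaluating full #if expressions is left out), works on an in-memory string, and takes the set of defined macros as a plain NULL-terminated array. The names `filter_conditionals`, `is_defined`, and `MAX_NEST` are made up for the example. Lines in a false branch are dropped without being looked at any further, so an #include in there never reaches the code that would open the file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_NEST 32   /* arbitrary limit on nested conditionals */

/* Hypothetical macro table lookup: `defs` is a NULL-terminated name list. */
static int is_defined(const char *name, const char *const *defs)
{
    for (; *defs; defs++)
        if (strcmp(*defs, name) == 0) return 1;
    return 0;
}

/* Copy the active lines of `src` into `dst` (capacity `cap`).
   Returns 0 on success, -1 on malformed or unbalanced conditionals,
   overlong lines, or lack of space in `dst`. */
int filter_conditionals(const char *src, const char *const *defs,
                        char *dst, size_t cap)
{
    int parent[MAX_NEST];   /* was the enclosing region active? */
    int cond[MAX_NEST];     /* did the #ifdef condition hold?   */
    int depth = 0, active = 1;
    size_t used = 0;

    assert(src && defs && dst);
    if (cap == 0) return -1;
    dst[0] = '\0';

    while (*src) {
        /* Cut out one line, including its newline if present. */
        const char *nl = strchr(src, '\n');
        size_t len = nl ? (size_t)(nl - src) + 1 : strlen(src);
        char line[256], kw[16], arg[64];
        if (len >= sizeof line) return -1;
        memcpy(line, src, len);
        line[len] = '\0';
        src += len;

        /* n >= 1 means the line starts with '#' followed by a keyword. */
        int n = sscanf(line, " #%15s %63s", kw, arg);
        if (n >= 1 && strcmp(kw, "ifdef") == 0) {
            if (n < 2 || depth == MAX_NEST) return -1;
            parent[depth] = active;
            cond[depth] = is_defined(arg, defs);
            active = active && cond[depth];
            depth++;
        } else if (n >= 1 && strcmp(kw, "else") == 0) {
            if (depth == 0) return -1;
            active = parent[depth - 1] && !cond[depth - 1];
        } else if (n >= 1 && strcmp(kw, "endif") == 0) {
            if (depth == 0) return -1;
            active = parent[--depth];
        } else if (active) {
            if (used + len >= cap) return -1;
            memcpy(dst + used, line, len + 1);   /* copies the '\0' too */
            used += len;
        }
        /* Inactive lines, including any #include, are simply dropped. */
    }
    return depth == 0 ? 0 : -1;
}
```

In a real preprocessor the surviving #include lines would be expanded at that point instead of being copied through, but the nesting bookkeeping would look much the same.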
Upvotes: 2