Why does my C/C++ header parser not work?

I want to bundle up the Boost Preprocessor library (and eventually others) into a single amalgamated header, so I threw together a small utility program to do this... only it's not working! I can't tell whether a bug in my code, a flaw in my approach, or both is causing it to fail.

The program is supposed to open the boost\preprocessor\library.hpp header (which includes the entire library) and recursively output to stdout all the headers the library needs. Windows Explorer reports there are (as of Boost Preprocessor v1.59.0) 270 header files in the directory tree, but my program is only parsing 204.

I test the amalgamated header by using it in another project that uses Boost Preprocessor. When using the boost\preprocessor\library.hpp header the project compiles and works fine, but when using my amalgamated version compilation fails due to not finding all the Boost Preprocessor macros needed.

The full, compilable code (only tested with MSVC v19):

#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h> // exit(), malloc(), free(), EXIT_FAILURE, EXIT_SUCCESS
#include <string.h>
#include <string>
#include <unordered_map>

// Remember what header files have already been parsed.
std::unordered_map<std::string, bool> have_parsed;

char include_dir[FILENAME_MAX]; // Passed in from the command line.

// Advance given pointer to next non-whitespace character and return it.
char* find_next_nonwhitespace_char(char* start) {
    while(isspace((int)(*start)) != 0) start++;
    return start;
}

#define DIE(condition, str) if(condition) {perror(str); exit(EXIT_FAILURE);}

int headers_parsed = 0;

void parse_header(const char* filename) {
    headers_parsed++;
    char path[FILENAME_MAX];
    strcpy(path, include_dir);
    strcat(path, filename);

    // Open file, get size and slurp it up.
    FILE* file = fopen(path, "rb");
    DIE(file == NULL, "fopen()");
    fseek(file, 0L, SEEK_END);
    long int file_size = ftell(file);
    rewind(file);
    char* file_buffer = (char*)malloc(file_size+1); // +1 for extra '\0'
    DIE(file_buffer == NULL, "malloc()");
    size_t got = fread(file_buffer, 1, file_size, file);
    DIE(got != file_size, "fread()");
    fclose(file);

    char* read_index = file_buffer;
    char* end_of_file = file_buffer + file_size;
    *end_of_file = '\0';

    // File is now in memory, parse each line.
    while(read_index < end_of_file) {
        char* start_of_line = read_index;
        // Scan forward looking for newline or 'EOF'
        char* end_of_line = strchr(start_of_line, '\n');
        if(end_of_line == NULL) end_of_line = end_of_file;
        *end_of_line = '\0';
        // Advance to the start of the next line for the next read.
        read_index += (end_of_line - start_of_line) + 1;
        // Look for #include directive at the start of the line.
        char* first_char = find_next_nonwhitespace_char(start_of_line);
        if(*first_char == '#') {
            // This could be an include line...
            char* next_char = find_next_nonwhitespace_char(first_char + 1);
            const char include[] = "include ";
            if(strncmp(next_char, include, strlen(include)) == 0) {
                char* open_brace = find_next_nonwhitespace_char(next_char + strlen(include));
                if(*open_brace++ == '<') {
                    char* close_brace = strchr(open_brace, '>');
                    assert(close_brace != NULL);
                    *close_brace = '\0';
                    if(have_parsed[open_brace] == false) {
                        have_parsed[open_brace] = true;
                        parse_header(open_brace); // Recurse
                    }
                    continue;
                }
            }
        }
        fprintf(stdout, "%s\n", start_of_line);
    }
    free(file_buffer);
}

int main(int argc, char* argv[]) {
    if(argc < 3) {
        fprintf(stderr, "%s {include directory} {first header}\n", argv[0]);
        return EXIT_FAILURE;
    }

    // Ensure the include directory has trailing slash
    strcpy(include_dir, argv[1]);
    size_t len = strlen(argv[1]);
    if(include_dir[len-1] != '\\') {
        include_dir[len] = '\\';
        include_dir[len+1] = '\0';
    }

    parse_header(argv[2]);

    fprintf(stderr, "headers parsed: %d\n", headers_parsed);
    return EXIT_SUCCESS;
}

Running the compiled program: (Boost is installed in g:\dev)

g:\dev\amalgamate\amalgamate.exe g:\dev\ boost\preprocessor\library.hpp > boost_pp.h
headers parsed: 204

And the generated boost_pp.h header: https://copy.com/vL6xdtScLogqnv9z

What's wrong? Why is my program not creating a working, amalgamated header?

Upvotes: 1

Views: 549

Answers (1)

Adrian17

Reputation: 443

Some of the headers in the tree aren't actually included by library.hpp, either because:

  • they only wrap other headers as an external interface (or are deprecated); for example, preprocessor/comma.hpp simply includes preprocessor/punctuation/comma.hpp, or

  • they are meant to be included via a macro, for example:

    # define BOOST_PP_ITERATE() BOOST_PP_CAT(BOOST_PP_ITERATE_, BOOST_PP_INC(BOOST_PP_ITERATION_DEPTH()))
    #
    # define BOOST_PP_ITERATE_1 <boost/preprocessor/iteration/detail/iter/forward1.hpp>
    # define BOOST_PP_ITERATE_2 <boost/preprocessor/iteration/detail/iter/forward2.hpp>
    # define BOOST_PP_ITERATE_3 <boost/preprocessor/iteration/detail/iter/forward3.hpp>
    # define BOOST_PP_ITERATE_4 <boost/preprocessor/iteration/detail/iter/forward4.hpp>
    # define BOOST_PP_ITERATE_5 <boost/preprocessor/iteration/detail/iter/forward5.hpp>
    

These macros can then be used like this:

#define BOOST_PP_ITERATION_PARAMS_1 (...)
#include BOOST_PP_ITERATE()
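
A tool that only follows literal #include <...> lines cannot see these macro-based includes, so a line like #include BOOST_PP_ITERATE() is copied through unchanged and the header it would expand to is never visited. As a rough sketch (IncludeKind, classify_include and check_examples are hypothetical names, not part of your program or of Boost), the parser could at least classify each directive and report the computed ones instead of passing them through silently:

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper types, not taken from the question's program.
enum class IncludeKind { None, Angle, Quoted, Computed };

// Classify one source line by the form of its #include directive, if any.
// Computed means the operand is neither <...> nor "...", i.e. a macro
// such as BOOST_PP_ITERATE() that only the real preprocessor can expand.
IncludeKind classify_include(const std::string& line) {
    std::size_t i = 0;
    auto skip_ws = [&] {
        while (i < line.size() &&
               std::isspace(static_cast<unsigned char>(line[i]))) ++i;
    };

    skip_ws();
    if (i >= line.size() || line[i] != '#') return IncludeKind::None;
    ++i;
    skip_ws(); // Boost.Preprocessor writes "# include", so allow spaces here.

    const std::string keyword = "include";
    if (line.compare(i, keyword.size(), keyword) != 0) return IncludeKind::None;
    i += keyword.size();

    // Reject longer identifiers that merely start with "include".
    if (i < line.size() &&
        (std::isalnum(static_cast<unsigned char>(line[i])) || line[i] == '_'))
        return IncludeKind::None;

    skip_ws();
    if (i >= line.size()) return IncludeKind::None; // no operand at all
    if (line[i] == '<') return IncludeKind::Angle;
    if (line[i] == '"') return IncludeKind::Quoted;
    return IncludeKind::Computed;
}

// Usage check with lines in the style of the headers quoted above.
void check_examples() {
    assert(classify_include("# include <boost/preprocessor/cat.hpp>") == IncludeKind::Angle);
    assert(classify_include("#include BOOST_PP_ITERATE()") == IncludeKind::Computed);
    assert(classify_include("# define BOOST_PP_ITERATE_1 <x.hpp>") == IncludeKind::None);
}
```

This only detects the problem. Resolving a computed include to an actual file would require expanding the macros, which is effectively the job of a full preprocessor, so printing a warning to stderr for every Computed line is probably the most a simple amalgamator can do.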

Upvotes: 5
