Erik Nyquist

Reputation: 1317

C: recursively opening sub-directories and creating new files

I'm writing a program that recursively finds .c and .h files and strips all comments from them (just as a learning exercise). For every .c/.h file found, the program creates an additional file that is identical to the original but without the comments. For example, "helloworld.c" would produce an additional file "__helloworld.c".

The problem I am encountering is this:

I have a loop which iterates over all entries in a directory and keeps going until it stops finding files with .c or .h extensions. However, the loop never actually ends, because each time a file is found, another one is created. So "__helloworld.c" becomes "____helloworld.c", which becomes "______helloworld.c", and so on. (In case anyone suggests otherwise: yes, the new files must have a .c extension.)

One possible solution is to keep track of the inode numbers so that only original files are processed. However, this requires several passes over the directory: one to count the entries (and use that number to size an array for the inode numbers), a second to store the inode numbers, and a third to do the work.

Can anybody share any ideas that could achieve this in a single pass of the loop? The code is split across two files, so I have posted only the main recursive routine.

consume_comments(): takes a single file as its argument and creates a new file with the comments omitted.

My main routine pretty much just does some argument handling; the routine posted below is where the real problems are.

#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
opens a directory stream of the dir named by 'dirname',
looks for .c .h files, consumes comments. If the global 'rc' == 1,
find() calls itself when it encounters a sub-directory.
'rc', fnextension() and consume_comments() are defined elsewhere.
Returns the number of files processed.
*/
int find (const char * dirname)
{
        DIR * dh;
        struct dirent * dent;
        struct stat buf;
        const char * fnext;
        int filecount = 0;

        if (chdir(dirname) == -1 || (dh = opendir(".")) == NULL)
        {
                fprintf(stderr, "Error opening directory \"%s\"\n", dirname);
                exit(EXIT_FAILURE);
        }

        while ((dent = readdir(dh)) != NULL)
        {
                //skip "." and ".." by name: readdir() does not guarantee
                //that they are the first two entries.
                if (strcmp(dent->d_name, ".") == 0 || strcmp(dent->d_name, "..") == 0)
                        continue;

                if (lstat(dent->d_name, &buf) == -1)
                {
                        fprintf(stderr, "Error calling lstat() on \"%s\"\n", dent->d_name);
                        exit(EXIT_FAILURE);
                }

                if (S_ISDIR(buf.st_mode) && rc)
                {
                        filecount += find(dent->d_name);
                        //when this find() completes, it will be one level down:
                        //so we must come back up again.
                        chdir("..");
                }
                else if (S_ISREG(buf.st_mode))
                {
                        fnext = fnextension(dent->d_name);
                        if (*fnext == 'c' || *fnext == 'h')
                        {
                                consume_comments(dent->d_name);
                                printf("Comments consumed:%20s\n", dent->d_name);
                                filecount++;
                        }
                }
        }

        closedir(dh);
        return filecount;
}

Upvotes: 4

Views: 306

Answers (3)

Erik Nyquist

Reputation: 1317

New implementation, using a routine chk_prefix() to match the prefix of filenames.

#include <string.h>

const char * prefix = "__nmc_";

/* Returns 1 if 'name' does NOT start with 'prefix', 0 if it does. */
int chk_prefix (const char * name)
{
        return strncmp(name, prefix, strlen(prefix)) != 0;
}

int find (const char * dirname)
{
        DIR * dh;
        struct dirent * dent;
        struct stat buf;
        const char * fnext;
        int filecount = 0;

        if (chdir(dirname) == -1 || (dh = opendir(".")) == NULL)
        {
                fprintf(stderr, "Error opening directory \"%s\"\n", dirname);
                exit(EXIT_FAILURE);
        }

        while ((dent = readdir(dh)) != NULL)
        {
                //skip "." and ".." by name: readdir() does not guarantee
                //that they are the first two entries.
                if (strcmp(dent->d_name, ".") == 0 || strcmp(dent->d_name, "..") == 0)
                        continue;

                if (lstat(dent->d_name, &buf) == -1)
                {
                        fprintf(stderr, "Error calling lstat() on \"%s\"\n", dent->d_name);
                        exit(EXIT_FAILURE);
                }

                if (S_ISDIR(buf.st_mode) && rc)
                {
                        filecount += find(dent->d_name);
                        //when this find() completes, it will be one level down:
                        //so we must come back up again.
                        chdir("..");
                }
                else if (S_ISREG(buf.st_mode))
                {
                        fnext = fnextension(dent->d_name);
                        //parentheses are required: && binds tighter than ||,
                        //so without them prefixed .c files would still match.
                        if ((*fnext == 'c' || *fnext == 'h') && chk_prefix(dent->d_name))
                        {
                                consume_comments(dent->d_name);
                                printf("Comments consumed:%20s\n", dent->d_name);
                                filecount++;
                        }
                }
        }

        closedir(dh);
        return filecount;
}

Upvotes: 1

Theolodis

Reputation: 5102

I see several solutions to your problem. In any case, you should check whether the file you are about to create already exists. Otherwise you could run into cases where you overwrite existing files!

(Example: with file.c and __file.c in your directory, you process __file.c and generate ____file.c, then you process file.c and overwrite __file.c.)

  1. Ignore files that begin with your chosen prefix.

    advantages: easy to implement

    downsides: you will skip any original files that happen to start with your prefix

  2. While going through the directory, build a set of the filenames you have created. Before converting any file, check whether it is one you created yourself.

    advantages: you don't miss files that begin with your prefix

    disadvantages: with a very long list of files, memory usage can grow large.
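The existence check recommended at the top of this answer can be folded into the creation of the output file itself. Here is a minimal sketch, assuming POSIX open(); the helper name create_output_exclusive() and its parameters are hypothetical and not part of the original program:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: creates "<prefix><name>" only if it does not exist yet.
 * Returns an open file descriptor on success, or -1 with errno set
 * (errno == EEXIST means the output file was already there).
 */
int create_output_exclusive(const char *prefix, const char *name)
{
        char out[4096];
        int n;

        assert(prefix != NULL && name != NULL);

        n = snprintf(out, sizeof out, "%s%s", prefix, name);
        if (n < 0 || (size_t)n >= sizeof out) {
                errno = ENAMETOOLONG;
                return -1;
        }

        /* O_EXCL makes open() fail instead of truncating an existing file. */
        return open(out, O_WRONLY | O_CREAT | O_EXCL, 0644);
}
```

With this, the file.c / __file.c example above would fail with EEXIST instead of silently overwriting __file.c, and the caller can decide whether to skip the file or report an error.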

edit: the second and third solutions from Mohit Jain look pretty good too!

Upvotes: 1

Mohit Jain

Reputation: 30489

You can use one of these three solutions:

  1. As suggested by @Theolodis, ignore files starting with __.
  2. Split your algorithm into two parts. In the first part, prepare a list of all the .c and .h files (recursive). In the second part, go through the list and generate the stripped versions of the files (non-recursive).
  3. Prepare the stripped .c and .h files in a temporary directory (/tmp on Linux or %TEMP% on Windows) and move them into the folder once all the .c and .h files in that folder have been processed. Then scan the sub-folders.
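The second option could look roughly like the sketch below. It is only an illustration under some assumptions: struct pathlist, is_source(), collect_sources() and process_sources() are hypothetical names, full paths are used instead of chdir(), and the per-file routine is assumed to have the signature void (*)(const char *), which may not match the real consume_comments().

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Growable list of heap-allocated path strings (hypothetical helper type). */
struct pathlist {
        char **items;
        size_t len, cap;
};

/* Appends a copy of 'path' to the list. Returns 0 on success, -1 on failure. */
int pathlist_push(struct pathlist *pl, const char *path)
{
        assert(pl != NULL && path != NULL);
        if (pl->len == pl->cap) {
                size_t ncap = pl->cap ? pl->cap * 2 : 16;
                char **tmp = realloc(pl->items, ncap * sizeof *tmp);
                if (tmp == NULL)
                        return -1;
                pl->items = tmp;
                pl->cap = ncap;
        }
        char *copy = malloc(strlen(path) + 1);
        if (copy == NULL)
                return -1;
        strcpy(copy, path);
        pl->items[pl->len++] = copy;
        return 0;
}

void pathlist_free(struct pathlist *pl)
{
        for (size_t i = 0; i < pl->len; i++)
                free(pl->items[i]);
        free(pl->items);
        pl->items = NULL;
        pl->len = pl->cap = 0;
}

/* Returns 1 if 'name' ends in ".c" or ".h", 0 otherwise. */
int is_source(const char *name)
{
        size_t n = strlen(name);
        return n > 2 && name[n - 2] == '.' &&
               (name[n - 1] == 'c' || name[n - 1] == 'h');
}

/* Phase 1: recursively record every .c/.h file under 'dir'. Returns 0 or -1. */
int collect_sources(const char *dir, struct pathlist *pl)
{
        DIR *dh = opendir(dir);
        struct dirent *dent;
        int rv = 0;

        if (dh == NULL)
                return -1;

        while (rv == 0 && (dent = readdir(dh)) != NULL) {
                char path[4096];
                struct stat st;

                if (strcmp(dent->d_name, ".") == 0 || strcmp(dent->d_name, "..") == 0)
                        continue;

                int n = snprintf(path, sizeof path, "%s/%s", dir, dent->d_name);
                if (n < 0 || (size_t)n >= sizeof path)
                        continue;       /* path too long: skip this entry */
                if (lstat(path, &st) == -1)
                        continue;       /* entry vanished or is unreadable: skip */

                if (S_ISDIR(st.st_mode))
                        rv = collect_sources(path, pl);
                else if (S_ISREG(st.st_mode) && is_source(dent->d_name))
                        rv = pathlist_push(pl, path);
        }

        closedir(dh);
        return rv;
}

/* Phase 2: hand each recorded path to the per-file routine. */
void process_sources(const struct pathlist *pl, void (*process)(const char *))
{
        for (size_t i = 0; i < pl->len; i++)
                process(pl->items[i]);
}
```

Because the list is complete before any output file is written, the files created in the second phase are never visited, whatever their names are.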

Upvotes: 4
