J.Clark

Reputation: 175

Deletion of comments and more in a C source file

I want to design a pipeline of sed and awk that removes all comments and blank lines from a C source file, adds line numbers, and saves the output to new_example.c. So far, the only thing I have been able to come up with is 's/[/**/]//g', which only removes the "/*" and "*/" delimiters and not the text in between.

//this is a comment
#include <stdio.h>
#include <stdlib.h>
/* this is the main program
remove this line
and this line
*/

int main(int argc, char *argv[])
{
    //this is another comment
    char *path;
    int numbers[10];
    int *a1;
    a1= malloc(10*sizeof(int));

    float *a2;
    a2 = malloc(10*sizeof(float));

    a1[2] = 10;
    a2[4] = 3.14;
    free(a1 );
    free(a2);

    return 0;
}

Upvotes: 0

Views: 62

Answers (2)

Akshay Hegde

Reputation: 16997

This may help you too

gcc -fpreprocessed -E   test.c | sed '/^\s*$/d'

gcc -fpreprocessed -E test.c removes the comments.

sed '/^\s*$/d' removes the blank lines. Note that \s is a GNU sed extension; [[:space:]] is the portable equivalent.

Test Input file

[akshay@localhost tmp]$ cat test.c
//this is a comment
#include <stdio.h>
#include <stdlib.h>
/* this is the main program
remove this line
and this line
*/

int main(int argc, char *argv[])
{
    //this is another comment
    char *path;
    int numbers[10];
    int *a1;
    a1= malloc(10*sizeof(int)); // here is comment

    float /*comment*/ *a2;
    a2 = malloc(10*sizeof(float)); /* comment*/

    a1[2] = 10;
    a2[4] = 3.14;
    free(a1 );
    free(a2);

    return 0;
}

Output

[akshay@localhost tmp]$ gcc -fpreprocessed -E   test.c | sed '/^\s*$/d'
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
    char *path;
    int numbers[10];
    int *a1;
    a1= malloc(10*sizeof(int));
    float *a2;
    a2 = malloc(10*sizeof(float));
    a1[2] = 10;
    a2[4] = 3.14;
    free(a1 );
    free(a2);
    return 0;
}
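
The question also asks for line numbers and for the result to be saved to new_example.c, which the command above does not do. A minimal sketch of that last stage, assuming awk is acceptable in place of the sed step: the function name number_nonblank is made up for illustration, and the printf input is only a stand-in for the comment-free output of gcc.

```shell
# Drop blank (whitespace-only) lines and prefix the rest with a line number.
# NF is the number of fields on the line, which is 0 for a blank line.
number_nonblank() {
    awk 'NF { printf "%6d  %s\n", ++n, $0 }'
}

# Stand-in input; in practice, pipe the output of the gcc command
# into number_nonblank instead of this printf.
printf '%s\n' '#include <stdio.h>' '' 'int main(void)' '{' '   ' '    return 0;' '}' |
    number_nonblank > new_example.c
cat new_example.c
```

Because awk handles both the blank-line filter and the numbering, the separate sed stage is no longer needed in this variant.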

Upvotes: 0

Ed Morton

Reputation: 204731

You can't do it without a language parser. Do not waste your time trying some sed or awk or whatever scripting hack - it WILL fail for some cases even if you can't figure out what they are right now.
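
As one concrete failure case, consider a naive rule that deletes everything from // to the end of the line. This is a hypothetical script for illustration (the name strip_naive is made up), not one taken from the question, but it shows how a regex cannot tell a comment from a string literal:

```shell
# Hypothetical naive rule: delete from the first "//" to end of line.
strip_naive() {
    sed 's|//.*||'
}

# The "//" inside the string literal matches first, so the statement
# itself is truncated along with the real comment.
printf '%s\n' 'puts("http://example.com"); // say hello' | strip_naive
# prints: puts("http:
```

Patching this one case (for example by tracking quotes) just moves the problem to the next one, such as escaped quotes or comment markers inside character constants.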

Something like this will do what you want, using gcc to parse the C:

$ sed 's/a/aA/g; s/__/aB/g; s/#/aC/g' file.c |
        gcc -P -E - |
        sed 's/aC/#/g; s/aB/__/g; s/aA/a/g' |
        cat -n
 1  #include <stdio.h>
 2  #include <stdlib.h>
 3  int main(int argc, char *argv[])
 4  {
 5      char *path;
 6      int numbers[10];
 7      int *a1;
 8      a1= malloc(10*sizeof(int));
 9      float *a2;
10      a2 = malloc(10*sizeof(float));
11      a1[2] = 10;
12      a2[4] = 3.14;
13      free(a1 );
14      free(a2);
15      return 0;
16  }

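The sed scripts around the gcc are there to hide every __ and # from gcc so it doesn't expand constructs like #include and __FILE__. The first script turns a into an escape character (a becomes aA) so that aB and aC can unambiguously stand in for __ and #; the second script undoes the substitutions in reverse order.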
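
To see that the two sed scripts are exact inverses, they can be run back to back without gcc in between. This is only a sketch: the function names hide and unhide and the sample line are made up for illustration.

```shell
# Escape: make "a" the escape character, then encode "__" and "#".
hide() {
    sed 's/a/aA/g; s/__/aB/g; s/#/aC/g'
}

# Unescape: undo the substitutions in reverse order.
unhide() {
    sed 's/aC/#/g; s/aB/__/g; s/aA/a/g'
}

line='#define NAME __FILE__ /* a macro */'

# After hiding, the line contains no "#" and no "__" for gcc to act on.
printf '%s\n' "$line" | hide

# Hiding and then unhiding gives back the original line.
printf '%s\n' "$line" | hide | unhide
```

Escaping a first is what makes the round trip safe: any aB or aC already present in the source becomes aAB or aAC, so it cannot be mistaken for an encoded __ or # on the way back.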

If gcc doesn't parse your flavor of C to your liking by default, add an argument such as -ansi to select the C standard you are using.

Upvotes: 2
