Worvast

Reputation: 279

Sed script to delete C++ comments duplicates the lines

I'm using this script to delete all C/C++ comments with sed:

http://sed.sourceforge.net/grabbag/scripts/remccoms3.sed

sed -i -f remccoms3.sed Myfile.cpp

But this script duplicates every line that it keeps. For example, this input:

/*-------------------------------------------------------------------------------
This file is part of MyProject.
Author Worvast 
#-------------------------------------------------------------------------------*/

#include <fstream>
#include <sstream>

//Other files
#include "Data.h"
#include "utility.h"

  // Open input file
  std::ifstream input_file;

is converted to:

#include <fstream>
#include <fstream>
#include <sstream>
#include <sstream>


#include "Data.h"
#include "Data.h"
#include "utility.h"
#include "utility.h"


  std::ifstream input_file;
  std::ifstream input_file;

To be honest, I do not understand sed well enough to see where the error is. Any idea or solution to this problem?

Upvotes: 0

Views: 370

Answers (2)

Ed Morton

Reputation: 203665

Don't use that script. It's cute and some people may find it interesting as a mental exercise, but it's buggy (as it says itself, it's been bugfixed to some extent!), absurdly complicated and a completely inappropriate application for sed.

To remove comments from all versions of C or C++ code, just use the script at https://stackoverflow.com/a/13062682/1745001 and pass the appropriate C or C++ version to gcc as one of its arguments.

If you want to keep blank lines rather than have them removed along with the comments (I first wrote this tool for counting NCSL, so removing blank lines was desirable), just tweak the sed commands so that blank lines don't look blank to gcc:

$ cat decomment.sh
[ $# -eq 2 ] && arg="$1" || arg=""
eval file="\$$#"
sed 's/a/aA/g;s/__/aB/g;s/#/aC/g;s/^[[:space:]]*$/aD/' "$file" |
          gcc -P -E $arg - |
          sed 's/aD//;s/aC/#/g;s/aB/__/g;s/aA/a/g'

$ ./decomment.sh file

#include <fstream>
#include <sstream>

#include "Data.h"
#include "utility.h"

  std::ifstream input_file;

Or, if your input file is ANSI C, where comments cannot start with //, just tell the tool that:

$ ./decomment.sh -ansi file

#include <fstream>
#include <sstream>

//Other files
#include "Data.h"
#include "utility.h"

  // Open input file
  std::ifstream input_file;
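
The two sed passes in decomment.sh above act as a reversible encoding wrapped around gcc. As a rough sketch of that idea, the snippet below runs only the encode and decode steps, with gcc left out, to show that the round trip restores the text and that a blank line travels through as the placeholder aD. The function names encode and decode and the sample text are made up for illustration; the sed expressions are copied from the script.

```shell
# Hypothetical helper names; the sed expressions are the ones from decomment.sh.
encode() { sed 's/a/aA/g;s/__/aB/g;s/#/aC/g;s/^[[:space:]]*$/aD/'; }
decode() { sed 's/aD//;s/aC/#/g;s/aB/__/g;s/aA/a/g'; }

# Made-up sample: a directive, a blank line, and a line containing "__" and "a".
sample='#include "Data.h"

int __x = 1; // a'

# Every original "a" becomes "aA" first, so the markers aB, aC and aD
# cannot collide with anything already present in the input.
encoded=$(printf '%s\n' "$sample" | encode)
decoded=$(printf '%s\n' "$encoded" | decode)

printf 'encoded:\n%s\n\ndecoded:\n%s\n' "$encoded" "$decoded"
```

In the real script gcc sits between the two passes, so it never sees a line starting with # or an identifier containing __, and the blank lines reach it as ordinary non-blank text.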

Here's an example of a C construct (the trigraph ??/ means \) that the enormous sed script won't handle correctly but the small sed+gcc script will handle just fine because gcc includes a parser for the language, not a bunch of regexp estimations for it:

$ cat tst.c
//C hello world example
#include <stdio.h>

/??/
* This is a comment using trigraphs */

int main()
{
    printf("Hello world\n");
    return 0;
}


$ ./remccoms3.sed tst.c

#include <stdio.h>

/??/
* This is a comment using trigraphs */

int main()
{
    printf("Hello world\n");
    return 0;
}


$ ./decomment.sh -trigraphs tst.c
#include <stdio.h>


int main()
{
    printf("Hello world\n");
    return 0;
}

Upvotes: 0

Etan Reisner

Reputation: 80931

The intended command line to run that sed script is /bin/sed -nf (from the shebang line).

Your command (sed -i -f remccoms3.sed) leaves out the -n argument.

The -n argument to sed is

-n, --quiet, --silent

suppress automatic printing of pattern space

so without it you get both sed's automatic printing and the script's own printing, which is why every line appears twice.
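
The effect is easy to reproduce without the large script. In this minimal sketch, the one-command sed program 'p' stands in for a script that prints lines itself, as remccoms3.sed does; the input lines are made up.

```shell
# Without -n: sed auto-prints each line AND the explicit 'p' prints it again.
without_n=$(printf 'one\ntwo\n' | sed 'p')

# With -n: automatic printing is suppressed, so only the explicit 'p' output remains.
with_n=$(printf 'one\ntwo\n' | sed -n 'p')

printf 'without -n:\n%s\n\nwith -n:\n%s\n' "$without_n" "$with_n"
```

Assuming GNU sed, adding -n to the original command (sed -n -i -f remccoms3.sed Myfile.cpp) should therefore write each kept line only once. It is worth trying on a copy of the file first, since -i overwrites it.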

Upvotes: 2
