B.Ing

Reputation: 57

Removing comments from a datafile. What are the differences?

Let's say that you would like to remove comments from a datafile using one of two methods:

cat file.dat | sed -e "s/\#.*//"
cat file.dat | grep -v "#"

How do these individual methods work, and what is the difference between them? Would it also be possible for a person to write the clean data to a new file, while avoiding any possible warnings or error messages to end up in that datafile? If so, how would you go about doing this?

Upvotes: 0

Views: 53

Answers (2)

James Brown

Reputation: 37404

grep -v drops every line that contains a #, including lines that have data before the comment. For example:

$ cat file
first
# second
thi # rd

so

$ grep -v "#" file
first

This drops the data before the # on the third line too, which is probably not what you want. Instead, use:

$ grep -o "^[^#]*" file
first
thi 

This keeps the text before the #, as the sed command does, but it does not print empty lines. From man grep:

   -o, --only-matching
          Print  only  the  matched  (non-empty) parts of a matching line,
          with each such part on a separate output line.
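
To see the three behaviours side by side, here is a minimal sketch. The file name sample.dat and the variable names are only for illustration, and the result described for grep -o assumes GNU grep.

```shell
# Build a small sample file (hypothetical name: sample.dat).
printf 'first\n# second\nthi # rd\n' > sample.dat

# grep -v drops every line that contains a '#'.
grep_v_out=$(grep -v "#" sample.dat)

# sed keeps every line but removes the text from the first '#' onward,
# so a comment-only line becomes an empty line.
sed_out=$(sed -e "s/#.*//" sample.dat)

# grep -o prints only the non-empty text before the first '#'.
grep_o_out=$(grep -o "^[^#]*" sample.dat)

printf '%s\n' "grep -v:" "$grep_v_out" "sed:" "$sed_out" "grep -o:" "$grep_o_out"
```

Only the sed and grep -o variants keep the "thi " that precedes the comment on the third line.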

Upvotes: 0

RavinderSingh13

Reputation: 133508

How do these individual methods work, and what is the difference between them?

They do not give the same result. Your sed command replaces everything from the first # to the end of the line with nothing, so the text before the # is kept and a comment-only line becomes an empty line. grep -v, on the other hand, skips every line that contains a # anywhere, so data that precedes a comment on the same line is lost too.

The man pages give more detail:

man grep:

   -v, --invert-match
          Invert the sense of matching, to select non-matching lines.  (-v is specified by POSIX.)

man sed:

   s/regexp/replacement/
          Attempt to match regexp against the pattern space.  If successful, replace that portion matched with replacement.  The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.



Would it also be possible for a person to write the clean data to a new file, while avoiding any possible warnings or error messages to end up in that datafile?

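Yes. Warnings and error messages are written to stderr, which a plain > redirection does not capture, so they will not end up in the output file. To suppress them entirely, add 2>/dev/null to either command.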

If so, how would you go about doing this?

You could append 2>/dev/null 1>output_file to the command, where output_file is the name of the new file.
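
As a rough sketch of that redirection: the file names file.dat, missing.dat, clean.dat and errors.log are made up for the example, and missing.dat is left out on purpose so that sed has something to complain about. The || true only keeps the example going after sed returns a non-zero exit status.

```shell
# Hypothetical input file; missing.dat deliberately does not exist.
printf 'a 1 # note\n# header\nb 2\n' > file.dat
rm -f missing.dat

# Cleaned data goes to clean.dat; error messages are discarded.
sed -e "s/#.*//" file.dat missing.dat 2>/dev/null 1>clean.dat || true

# Alternative: keep the messages in a separate log instead of discarding them.
sed -e "s/#.*//" file.dat missing.dat 2>errors.log 1>clean.dat || true
```

In both cases clean.dat receives only what sed writes to stdout, and the complaint about missing.dat goes elsewhere.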

Explanation of the sed command: the breakdown below is for understanding only. Note that there is no need to pipe cat into sed; you could run sed -e "s/\#.*//" Input_file directly.

sed -e "  ## Start sed; -e adds the script that follows to the commands to be executed.
s/        ## s starts a substitution; the regexp comes next.
\#.*      ## Match from the first # through the end of the line.
//"       ## Replace the matched text with an empty string.
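
If the empty lines that sed leaves behind are unwanted, one possible variation (not part of the original command, only a sketch) is a second expression that deletes lines containing nothing but whitespace. The file name sample.dat is again hypothetical.

```shell
# Hypothetical sample file.
printf 'first\n# second\nthi # rd\n' > sample.dat

# First expression strips comments; second deletes lines that are now blank.
cleaned=$(sed -e 's/#.*//' -e '/^[[:space:]]*$/d' sample.dat)

printf '%s\n' "$cleaned"
```

Single quotes are used here so the shell passes the script to sed unchanged.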

Upvotes: 2
