cyclone998
cyclone998

Reputation: 43

Filling empty spaces in a CSV file

I have a CSV file where some columns are empty such as

oski14,safe,0,13,53,4
oski15,Unknow,,,,0
oski16,Unknow,,,,0
oski17,Unknow,,,,0
oski18,unsafe,0.55,,1,2
oski19,unsafe,0.12,4,,56

How do I replace all the empty columns with the word "empty". I have tried using awk(which is a command I am learning to use).

I want to have

oski14,safe,0,13,53,4
oski15,Unknow,empty,empty,empty,0
oski16,Unknow,empty,empty,empty,0
oski17,Unknow,empty,empty,empty,0
oski18,unsafe,0.55,empty,1,2
oski19,unsafe,0.12,4,empty,56

I tried to replace just the 3rd column to see if I was on the right track

    awk -F '[[:space:]]' '$2 && !$3{$3="empty"}1' file

this left me with

oski14,safe,0,13,53,4
oski15,Unknow,,,,0
oski16,Unknow,,,,0
oski17,Unknow,,,,0
oski18,unsafe,0.55,,1,2
oski19,unsafe,0.12,4,,56

I have also tried

    nawk -F, '{$3="\ "?"empty":$3;print}' OFS="," file

this resulted in

oski14,safe,empty,13,53,4
oski15,Unknow,empty,,,0
oski16,Unknow,empty,,,0
oski17,Unknow,empty,,,0
oski18,unsafe,empty,,1,2
oski19,unsafe,empty,4,,56

Lastly I tried

    awk '{if (!$3) {print $1,$2,"empty"} else {print $1,$2,$3}}' file

this left me with

oski14,safe,empty,13,53,4 empty
oski15,Unknow,empty,,,0 empty
oski16,Unknow,empty,,,0 empty
oski17,Unknow,empty,,,0 empty
oski18,unsafe,empty,,1,2 empty
oski19,unsafe,empty,4,,56 empty

Upvotes: 4

Views: 854

Answers (2)

Enlico
Enlico

Reputation: 28500

This is the awk command

awk 'BEGIN { FS=","; OFS="," }; { for (i=1;i<=NF;i++) { if ($i == "") { $i = "empty" }}; print $0 }' yourfile

As suggested in the comments, you can shorten the BEGIN procedure to FS=OFS="," as awk allows chained assignment (which I did not know, thank you @EdMorton).

I've set FS="," in the BEGIN procedure instead of using the -F, option just for uniformity with setting OFS=",".

Clearly you can put the script in a more nice looking form:

#!/usr/bin/awk -f
BEGIN {
  FS  = ","
  OFS = ","
}
{
  for (i = 1; i <= NF; ++i)
    if ($i == "")
      $i = "empty"
  print $0
}

and use it as a standalone program (you have to chmod +x it), even if this is known to have some drawbacks (consult the comments to this question as well as this answer):

./the_script_above your_file

or

down_the_pipe | ./the_script_above | further_processing

Clearly you are still able to feed the above script to awk this way:

awk -f the_script_above file1 file2

Upvotes: 1

Ed Morton
Ed Morton

Reputation: 204558

With a sed that supports EREs with a -E argument (e.g. GNU sed or OSX/BSD sed):

$ sed -E 's/(^|,)(,|$)/\1empty\2/g; s/(^|,)(,|$)/\1empty\2/g' file
oski14,safe,0,13,53,4
oski15,Unknow,empty,empty,empty,0
oski16,Unknow,empty,empty,empty,0
oski17,Unknow,empty,empty,empty,0
oski18,unsafe,0.55,empty,1,2
oski19,unsafe,0.12,4,empty,56

You need to do the substitution twice because given contiguous commas like ,,, one regexp match would use up the first 2 ,s and so you'd be left with ,empty,,.

The above would change a completely empty line into empty, let us know if that's an issue.

Upvotes: 4

Related Questions