Prashant Lakhera

Reputation: 880

Filtering Data in CSV file

I have a CSV file in this format:

a,b,c,d,e,f,no disk detected
a,b,c,d,e,f,disk run into error
a,b,c,d,e,f,no memory in the server
a,b,c,d,e,f,memory has correctable errors

In the last column, I need to search for the word

  • disk and replace the column with disk error
  • memory and replace the column with memory error

That part I have already figured out:

 cat filename.csv |awk -F "," '{print $NF}' |sed 's/^.*disk.*$/disk error/'  |sed 's/^.*memory.*$/memory error/' 

Now the part I need help with: when sed replaces the string, is it possible to write the result back into the same file (filename.csv), or to generate a new file containing all columns plus the updated last column? The new file should look like this:

 a,b,c,d,e,f,disk error
 a,b,c,d,e,f,disk error
 a,b,c,d,e,f,memory error
 a,b,c,d,e,f,memory error

Upvotes: 1

Views: 57

Answers (2)

Tyl

Reputation: 5252

An awk solution:

awk -F, '{if ($NF~/disk/) $NF="disk error"; if ($NF~/memory/) $NF="memory error";}1' OFS=, file

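With GNU awk v4.1.0+, you can add the -i inplace switch to change the file in place.
Otherwise, write to a temporary file and move it over the original: awk .... file > file.tmp && mv file.tmp file. Avoid awk .... file | tee file, because tee can truncate the file before awk has finished reading it.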
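As a minimal sketch of in-place editing without GNU awk's -i inplace, the output can go to a temporary file that then replaces the original. The file name sample.csv and the scratch directory are assumptions made up for this illustration; only POSIX awk features are used.

```shell
# Hypothetical sample file in a scratch directory (names are illustrative).
tmpdir=$(mktemp -d)
file="$tmpdir/sample.csv"
printf '%s\n' \
  'a,b,c,d,e,f,no disk detected' \
  'a,b,c,d,e,f,memory has correctable errors' > "$file"

# Write the result to a temporary file first, then replace the original
# only if awk exited successfully.
awk -F, '{if ($NF~/disk/) $NF="disk error"; if ($NF~/memory/) $NF="memory error";}1' OFS=, "$file" > "$file.tmp" &&
  mv "$file.tmp" "$file"

# Read the file back to confirm the rewrite.
result=$(cat "$file")
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

Because the original is replaced only after awk succeeds, a failed run leaves the input file untouched.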

The command above does not require GNU awk, but if you have GNU awk, you can write it more concisely:

awk -F, '{if (match($NF,/(disk|memory)/,m)) $NF=m[1] " error";}1' OFS=, file

NF is the number of fields in the current line; $NF is the last field.
-F, sets the field separator (FS) to a comma.
OFS=, sets the output field separator to a comma.
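To see what these settings do, here is a small sketch on one sample line from the question. The variable names (line, count, last, with_ofs, without_ofs) are only for illustration. It also shows why OFS=, matters: assigning to a field makes awk rebuild the line with OFS, which defaults to a single space.

```shell
line='a,b,c,d,e,f,no disk detected'

# NF is the field count; $NF is the last field.
count=$(printf '%s\n' "$line" | awk -F, '{print NF}')
last=$(printf '%s\n' "$line" | awk -F, '{print $NF}')

# Assigning to $NF rebuilds the record, joining the fields with OFS.
with_ofs=$(printf '%s\n' "$line" | awk -F, '{$NF="disk error"}1' OFS=,)

# Without OFS=, the rebuilt record is joined with spaces instead of commas.
without_ofs=$(printf '%s\n' "$line" | awk -F, '{$NF="disk error"}1')

printf '%s\n' "$count" "$last" "$with_ofs" "$without_ofs"
```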

Upvotes: 2

anubhava

Reputation: 784998

It is easier to do with sed:

sed -E 's/^(.+,).*(disk|memory).*$/\1\2 error/' file.csv

a,b,c,d,e,f,disk error
a,b,c,d,e,f,disk error
a,b,c,d,e,f,memory error
a,b,c,d,e,f,memory error

To make the changes in place in the same file (keeping a backup copy as file.csv.bak), use:

sed -i.bak -E 's/^(.+,).*(disk|memory).*$/\1\2 error/' file.csv

== Details ==

Search Regex:

  • ^: Start
  • (.+,): Greedy match up to the last comma, captured in group #1
  • .*(disk|memory): Match 0 or more characters, then match disk or memory and capture that word in group #2
  • .*$: Match 0 or more characters before end

Replacement Pattern:

  • \1: Back-reference to group #1 to place text till last comma back
  • \2 error: Append disk error or memory error
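One thing worth checking on real data: because .* can also match commas, the regex above can backtrack to an earlier comma when disk or memory appears only in an earlier column. The sketch below is a variant, not part of the answer as posted. It uses [^,]* to confine the match to the last column, and the second input line is a made-up example.

```shell
# Two lines: the keyword in the last column, and the keyword only in an
# earlier column (the second line is a hypothetical input).
input=$(printf '%s\n' 'a,b,c,d,e,f,no disk detected' 'a,disk,c,ok')

# Pattern from the answer: .* may cross commas.
loose=$(printf '%s\n' "$input" | sed -E 's/^(.+,).*(disk|memory).*$/\1\2 error/')

# Variant: [^,]* cannot cross a comma, so only the last column is examined.
strict=$(printf '%s\n' "$input" | sed -E 's/^(.*,)[^,]*(disk|memory)[^,]*$/\1\2 error/')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

With the loose pattern the second line collapses to a,disk error, while the strict variant leaves it unchanged.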

Upvotes: 2
