Reputation: 2725
I have a comma delimited file that looks like the following
Start,Card,App,Description,Internet
2013-01-02 11:54:00,No Data,Off,Our system has No Data,Off
2013-01-03 05:30:00,Visa,On,The monitoring system is Off,Off
2013-01-05 17:07:00,Master,Off,The system is On,On
2013-01-07 00:08:00,No Data,No Data,No Data,On
2013-01-09 04:40:00,Master,Off,System is Off,On
I want to replace No Data
with NA
, Off
with 0
, and On
with 1
only iff these words are fields, i.e. that is they have comma or beginning of newline or end of line. The required output is
Start,Card,App,Description,Internet
2013-01-02 11:54:00,NA,0,Our system has No Data,0
2013-01-03 05:30:00,Visa,1,The monitoring system is Off,0
2013-01-05 17:07:00,Master,0,The system is On,1
2013-01-07 00:08:00,NA,NA,NA,1
2013-01-09 04:40:00,Master,0,System is Off,1
I tried for example,
sed 's/,Off,/,0,/g' test.txt
will only replace if the number of fields is greater than one or less than maximum number of fields. I also tried using look-arounds and leaving out header like
sed '2,$s/(?<=^|,)Off(?=$|,)/0/g' test.txt
I think this second command should work but didn't. Where did I go wrong? Thanks for the tips. Since the file is biggg, absolute indexing for columns like $4 is not an option.
Upvotes: 0
Views: 74
Reputation: 58478
This might work for you (GNU sed):
sed -r ':a;s/(^|,)No Data(,|$)/\1NA\2/g;s/(^|,)Off(,|$)/\10\2/g;s/(^|,)On(,|$)/\11\2/g;ta' file
Use back references with alternation to replace required string and delimiters.
N.B. Delimiters either side of a required string must be tested twice hence the loop back to the place holder :a
.
Upvotes: 1
Reputation: 23677
sed
doesn't support lookarounds, perl
to the rescue
$ perl -pe 'if($. > 1){ s/(^|,)\KOff(?=,|$)/0/g; s/(^|,)\KOn(?=,|$)/1/g; s/(^|,)\KNo Data(?=,|$)/NA/g; }' ip.txt
Start,Card,App,Description,Internet
2013-01-02 11:54:00,NA,0,Our system has No Data,0
2013-01-03 05:30:00,Visa,1,The monitoring system is Off,0
2013-01-05 17:07:00,Master,0,The system is On,1
2013-01-07 00:08:00,NA,NA,NA,1
2013-01-09 04:40:00,Master,0,System is Off,1
Or, split input line on ,
and perform substitution on individual fields
$ perl -F, -lane 'if($. > 1){ foreach (@F){ s/^No Data$/NA/; s/^Off$/0/; s/^On$/1/ } } print join ",",@F' ip.txt
Start,Card,App,Description,Internet
2013-01-02 11:54:00,NA,0,Our system has No Data,0
2013-01-03 05:30:00,Visa,1,The monitoring system is Off,0
2013-01-05 17:07:00,Master,0,The system is On,1
2013-01-07 00:08:00,NA,NA,NA,1
2013-01-09 04:40:00,Master,0,System is Off,1
with awk
$ awk -v FS=, -v OFS=, 'NR>1{ for(i=1; i<=NF; i++){ sub(/^No Data$/,"NA",$i); sub(/^Off$/,"0",$i); sub(/^On$/,"1",$i) } } 1' ip.txt
Start,Card,App,Description,Internet
2013-01-02 11:54:00,NA,0,Our system has No Data,0
2013-01-03 05:30:00,Visa,1,The monitoring system is Off,0
2013-01-05 17:07:00,Master,0,The system is On,1
2013-01-07 00:08:00,NA,NA,NA,1
2013-01-09 04:40:00,Master,0,System is Off,1
Upvotes: 2
Reputation: 158
awk -F, '{for(i=1;i<=NF;i++) if(split($i,a,"Off|On|No Data| +")==2) {sub(/No Data/,"NA",$i);sub(/Off/,"0",$i);sub(/On/,"1",$i)}}1' OFS="," file
Upvotes: 1
Reputation: 37424
In awk using ternary operators (c ? "a" : "b")
:
$ awk -F, '{for(i=NF;i>0;i--) $i=($i=="No Data"?"NA":($i=="Off"?"0":($i=="On"?"1":$i)))} 1' OFS=, file
Start,Card,App,Description,Internet
2013-01-02 11:54:00,NA,0,Our system has No Data,0
2013-01-03 05:30:00,Visa,1,The monitoring system is Off,0
2013-01-05 17:07:00,Master,0,The system is On,1
2013-01-07 00:08:00,NA,NA,NA,1
2013-01-09 04:40:00,Master,0,System is Off,1
Upvotes: 2