Reputation: 4643
Here is my file
$ cat -v test2
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description"
"2015-10-06","427","060","137765","Samsung Korea","Samsung SM-G900I","$39 Plan"
"2015-10-06","592","620","0","Apple Inc","Apple iPhone 6 (A1586)","PREPAY STD - TRIAL - #16"
"2015-10-06","007","290","0","Apple Inc","Apple iPhone 6 (A1586)","PREPAY PLUS - $0 -"
"2015-10-06","592","050","48836832","Apple Inc","Apple iPhone 5S (A1530)","Talk and Text Connect Flexi Plan"
"2016-04-27","498","220","146610","Guangdong Oppo Mobile Telecommunications Corp Ltd","OPPO X9076,OPPO R6006,OPPO R6001,OPPO N5116,OPPO X9006","$29.95 Carryover Plan (1GB)"
"2015-10-06","409","720","113755347","Samsung Korea","Samsung SM-G360G","$29 CARRYOVER PLAN"
"2015-10-06","742","620","19840943","Apple Inc","Apple iPhone S (A1530)","PREPAY STD - $0 - #2"
"2015-10-06","387","180","0","HUAWEI Technologies Co Ltd","HUAWEI HUAWEI G526-L11","PREPAY STD - $1 - #4"
this command add a column to the end
$ awk -F, -v OFS=, -v q='"' 'NR==1{$8=q"Data_Volume_MB"q} NR>1{$8=$4; gsub(/"/,"",$8); $8= q $8/(1024*1024)q}1' test2 | cat -v
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description","Data_Volume_MB"
"2015-10-06","427","060","137765","Samsung Korea","Samsung SM-G900I","$39 Plan","0.131383"
"2015-10-06","592","620","0","Apple Inc","Apple iPhone 6 (A1586)","PREPAY STD - TRIAL - #16","0"
"2015-10-06","007","290","0","Apple Inc","Apple iPhone 6 (A1586)","PREPAY PLUS - $0 -","0"
"2015-10-06","592","050","48836832","Apple Inc","Apple iPhone 5S (A1530)","Talk and Text Connect Flexi Plan","46.5744"
"2016-04-27","498","220","146610","Guangdong Oppo Mobile Telecommunications Corp Ltd","OPPO X9076,OPPO R6006,"0.139818",OPPO N5116,OPPO X9006","$29.95 Carryover Plan (1GB)"
"2015-10-06","409","720","113755347","Samsung Korea","Samsung SM-G360G","$29 CARRYOVER PLAN","108.486"
"2015-10-06","742","620","19840943","Apple Inc","Apple iPhone S (A1530)","PREPAY STD - $0 - #2","18.9218"
"2015-10-06","387","180","0","HUAWEI Technologies Co Ltd","HUAWEI HUAWEI G526-L11","PREPAY STD - $1 - #4","0"
my problem is this line
"2016-04-27","498","220","146610","Guangdong Oppo Mobile Telecommunications Corp Ltd","OPPO X9076,OPPO R6006,"0.139818",OPPO N5116,OPPO X9006","$29.95 Carryover Plan (1GB)"
It changes to this
"2016-04-27","498","220","146610","Guangdong Oppo Mobile Telecommunications Corp Ltd","OPPO X9076,OPPO R6006,"0.139818",OPPO N5116,OPPO X9006","$29.95 Carryover Plan (1GB)"
which has this "0.139818"
in the wrong place.
It does not turn out like the others. The issue seems to be the commas that are enclosed in the double quotes in this column:
"OPPO X9076,OPPO R6006,"0.139818",OPPO N5116,OPPO X9006"
What is the best way, or is it possible, to achieve this? this is what I want the line to look like, like the other lines.
"2016-04-27","498","220","146610","Guangdong Oppo Mobile Telecommunications Corp Ltd","OPPO X9076,OPPO R6006,OPPO N5116,OPPO X9006","$29.95 Carryover Plan (1GB)","0.139818"
Maybe I need to tidy the data, this line in particular before it gets to awk.
change the delimiter from , to ; and add the new column at the end
$ sed 's/","/";"/g' < test2 | awk -F';' -v OFS=';' -v q='"' 'NR==1{$8=q"Data_Volume_MB"q} NR>1{n=$4; gsub(/"/,"",n); $8= q n/(1024*1024)q}1'
"Rec Open Date";"MSISDN";"IMEI";"Data Volume (Bytes)";"Device Manufacturer";"Device Model";"Product Description";"Data_Volume_MB"
"2015-10-06";"427";"060";"137765";"Samsung Korea";"Samsung SM-G900I";"$39 Plan";"0.131383"
"2015-10-06";"592";"620";"0";"Apple Inc";"Apple iPhone 6 (A1586)";"PREPAY STD - TRIAL - #16";"0"
"2015-10-06";"007";"290";"0";"Apple Inc";"Apple iPhone 6 (A1586)";"PREPAY PLUS - $0 -";"0"
"2015-10-06";"592";"050";"48836832";"Apple Inc";"Apple iPhone 5S (A1530)";"Talk and Text Connect Flexi Plan";"46.5744"
"2016-04-27";"498";"220";"146610";"Guangdong Oppo Mobile Telecommunications Corp Ltd";"OPPO X9076,OPPO R6006,OPPO R6001,OPPO N5116,OPPO X9006";"$29.95 Carryover Plan (1GB)";"0.139818"
"2015-10-06";"409";"720";"113755347";"Samsung Korea";"Samsung SM-G360G";"$29 CARRYOVER PLAN";"108.486"
"2015-10-06";"742";"620";"19840943";"Apple Inc";"Apple iPhone S (A1530)";"PREPAY STD - $0 - #2";"18.9218"
"2015-10-06";"387";"180";"0";"HUAWEI Technologies Co Ltd";"HUAWEI HUAWEI G526-L11";"PREPAY STD - $1 - #4";"0"
change the delimiter from , to | and add the new column at the end
$ sed 's/","/"|"/g' < test2 | awk -F'|' -v OFS='|' -v q='"' 'NR==1{$8=q"Data_Volume_MB"q} NR>1{n=$4; gsub(/"/,"",n); $8= q n/(1024*1024)q}1'
"Rec Open Date"|"MSISDN"|"IMEI"|"Data Volume (Bytes)"|"Device Manufacturer"|"Device Model"|"Product Description"|"Data_Volume_MB"
"2015-10-06"|"427"|"060"|"137765"|"Samsung Korea"|"Samsung SM-G900I"|"$39 Plan"|"0.131383"
"2015-10-06"|"592"|"620"|"0"|"Apple Inc"|"Apple iPhone 6 (A1586)"|"PREPAY STD - TRIAL - #16"|"0"
"2015-10-06"|"007"|"290"|"0"|"Apple Inc"|"Apple iPhone 6 (A1586)"|"PREPAY PLUS - $0 -"|"0"
"2015-10-06"|"592"|"050"|"48836832"|"Apple Inc"|"Apple iPhone 5S (A1530)"|"Talk and Text Connect Flexi Plan"|"46.5744"
"2016-04-27"|"498"|"220"|"146610"|"Guangdong Oppo Mobile Telecommunications Corp Ltd"|"OPPO X9076,OPPO R6006,OPPO R6001,OPPO N5116,OPPO X9006"|"$29.95 Carryover Plan (1GB)"|"0.139818"
"2015-10-06"|"409"|"720"|"113755347"|"Samsung Korea"|"Samsung SM-G360G"|"$29 CARRYOVER PLAN"|"108.486"
"2015-10-06"|"742"|"620"|"19840943"|"Apple Inc"|"Apple iPhone S (A1530)"|"PREPAY STD - $0 - #2"|"18.9218"
"2015-10-06"|"387"|"180"|"0"|"HUAWEI Technologies Co Ltd"|"HUAWEI HUAWEI G526-L11"|"PREPAY STD - $1 - #4"|"0"
change the delimiter from , to ; and to insert it before the second last column
$ sed 's/","/";"/g' < test2 | awk -F';' -v OFS=';' -v q='"' 'NR==1{$(NF-1)=q"Data_Volume_MB"q FS $(NF-1)} NR>1{n=$4; gsub(/"/,"",n); $(NF-1)= q n/(1024*1024)q FS $(NF-1)}1'
"Rec Open Date";"MSISDN";"IMEI";"Data Volume (Bytes)";"Device Manufacturer";"Data_Volume_MB";"Device Model";"Product Description"
"2015-10-06";"427";"060";"137765";"Samsung Korea";"0.131383";"Samsung SM-G900I";"$39 Plan"
"2015-10-06";"592";"620";"0";"Apple Inc";"0";"Apple iPhone 6 (A1586)";"PREPAY STD - TRIAL - #16"
"2015-10-06";"007";"290";"0";"Apple Inc";"0";"Apple iPhone 6 (A1586)";"PREPAY PLUS - $0 -"
"2015-10-06";"592";"050";"48836832";"Apple Inc";"46.5744";"Apple iPhone 5S (A1530)";"Talk and Text Connect Flexi Plan"
"2016-04-27";"498";"220";"146610";"Guangdong Oppo Mobile Telecommunications Corp Ltd";"0.139818";"OPPO X9076,OPPO R6006,OPPO R6001,OPPO N5116,OPPO X9006";"$29.95 Carryover Plan (1GB)"
"2015-10-06";"409";"720";"113755347";"Samsung Korea";"108.486";"Samsung SM-G360G";"$29 CARRYOVER PLAN"
"2015-10-06";"742";"620";"19840943";"Apple Inc";"18.9218";"Apple iPhone S (A1530)";"PREPAY STD - $0 - #2"
"2015-10-06";"387";"180";"0";"HUAWEI Technologies Co Ltd";"0";"HUAWEI HUAWEI G526-L11";"PREPAY STD - $1 - #4"
Upvotes: 1
Views: 693
Reputation: 1628
I would suggest changing your field separator first, like so (here I am changing it from ,
to |
):
sed 's/","/"|"/g' < test2 > newfile
Then use your awk
code on newfile
.
You could put this all in one line of course (I'm not using your awk
code here but just my own awk
code as an example):
sed 's/","/"|"/g' < test2 | awk 'BEGIN{FS="|"} {print $1}'
In response to OP comments, be sure to run your command as such (notice I changed the -F,
to -F"|"
:
sed 's/","/"|"/g' < test2 | awk -F"|" -v OFS=, -v q='"' 'NR==1{$8=q"Data_Volume_MB"q} NR>1{$8=$4; gsub(/"/,"",$8); $8= q $8/(1024*1024)q}1'
Using your data, here is my results:
"Rec Open Date","MSISDN","IMEI","Data Volume (Bytes)","Device Manufacturer","Device Model","Product Description","Data_Volume_MB"
"2015-10-06","427","060","137765","Samsung Korea","Samsung SM-G900I","$39 Plan","0.131383"
"2015-10-06","592","620","0","Apple Inc","Apple iPhone 6 (A1586)","PREPAY STD - TRIAL - #16","0"
"2015-10-06","007","290","0","Apple Inc","Apple iPhone 6 (A1586)","PREPAY PLUS - $0 -","0"
"2015-10-06","592","050","48836832","Apple Inc","Apple iPhone 5S (A1530)","Talk and Text Connect Flexi Plan","46.5744"
"2016-04-27","498","220","146610","Guangdong Oppo Mobile Telecommunications Corp Ltd","OPPO X9076,OPPO R6006,OPPO R6001,OPPO N5116,OPPO X9006","$ Carryover Plan (1GB)","0.139818"
"2015-10-06","409","720","113755347","Samsung Korea","Samsung SM-G360G","$29 CARRYOVER PLAN","108.486"
"2015-10-06","742","620","19840943","Apple Inc","Apple iPhone S (A1530)","PREPAY STD - $0 - #2","18.9218"
"2015-10-06","387","180","0","HUAWEI Technologies Co Ltd","HUAWEI HUAWEI G526-L11","PREPAY STD - $1 - #4","0"
Upvotes: 1