Reputation: 540
I have a bunch of files whose rows contain more values than there are field names (the extra values are unnecessary), and on top of that each file has a header that I would like to keep.
For example, with a test_awk.txt file containing:
My header is here
it can have several lines
data1 data2 data3
1,2,3,4
2,3,4,5
What I want to have is the following:
My header is here
it can have several lines
data1,data2,data3
1,2,3
2,3,4
I tried a simple awk command, but it can only remove the column for the whole file, which deletes my header and, most importantly, the last field name:
awk 'BEGIN{FS=OFS=","}NR>2{NF--;print}' test_awk.txt
gives:
data1,data2
1,2,3
2,3,4
Upvotes: 2
Views: 49
Reputation: 133458
The following awk command may help:
awk -F' |,' '/^data/{val=NF;} /^[0-9]/ && NF>val{NF=val} 1' OFS=, Input_file
Output will be as follows.
My header is here
it can have several lines
data1 data2 data3
1,2,3
2,3,4
Explanation: here is the same command in multi-line form, with comments:
awk -F' |,' ' ##Set the field separator to a space or a comma for every line of Input_file.
/^data/{ ##If the line starts with the string data, do the following:
val=NF; ##Store the number of fields on the current line in the variable val.
}
/^[0-9]/ && NF>val{ ##If the line starts with a digit and its NF is greater than val, do the following:
NF=val ##Assign val to NF, which drops the extra trailing fields and rebuilds the line with OFS.
}
1 ##1 is an always-true condition with no action, so the default action (printing the current line) runs.
' OFS=, Input_file ##Set OFS (the output field separator) to a comma and name the Input_file to read.
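Note that the output above keeps the field-name line space-separated, while the question asks for data1,data2,data3. A minimal sketch of one way to get that, assuming the field-name line is the first line starting with `data` and that data rows start with a digit (the temporary file and the variable names `val`, `names`, `f`, `line` and `result` are only illustrative):

```shell
# Recreate the sample file from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'My header is here' 'it can have several lines' \
  'data1 data2 data3' '1,2,3,4' '2,3,4,5' > "$tmp"

result=$(awk '
  # Field-name line: count the space-separated names, then rejoin them with commas.
  /^data/ { val = split($0, names, " "); line = names[1]
            for (i = 2; i <= val; i++) line = line "," names[i]
            print line; next }
  # Data rows (only once val is known): keep the first val comma-separated fields.
  val && /^[0-9]/ { n = split($0, f, ","); line = f[1]
            for (i = 2; i <= n && i <= val; i++) line = line "," f[i]
            print line; next }
  # Everything else (the free-form header) is printed unchanged.
  { print }
' "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

Using split() instead of assigning to NF avoids relying on how a given awk rebuilds $0, at the cost of a slightly longer program.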
Upvotes: 2
Reputation: 23667
NR>3{NF--;print}
means: if NR>3, change NF and print the line, so lines with NR<=3 are never printed
$ awk 'BEGIN{FS=OFS=","} NR>3{NF--} 1' test_awk.txt
My header is here
it can have several lines
data1 data2 data3
1,2,3
2,3,4
NR>3{NF--}
changes the number of fields only for line numbers > 3
1
is the idiomatic way to print the contents of $0
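Decrementing NF works in gawk and mawk, but some awk implementations may not rebuild $0 when NF is lowered. As a hedged alternative, the sketch below removes the last comma-separated value with sub() and selects data rows by pattern rather than by line number, assuming data rows are the only lines that start with a digit (the temporary file and the variable name `trimmed` are only illustrative):

```shell
# Recreate the sample file from the question in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'My header is here' 'it can have several lines' \
  'data1 data2 data3' '1,2,3,4' '2,3,4,5' > "$tmp"

# On lines starting with a digit, delete the last comma and everything after it;
# the trailing 1 prints every line, modified or not.
trimmed=$(awk '/^[0-9]/ { sub(/,[^,]*$/, "") } 1' "$tmp")

printf '%s\n' "$trimmed"
rm -f "$tmp"
```

Like NF--, this always drops exactly one trailing value per data row, so it fits only when each row has one value too many.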
You can also use sed
$ sed '4,$ s/,[^,]*$//' test_awk.txt
My header is here
it can have several lines
data1 data2 data3
1,2,3
2,3,4
4,$
the substitution applies only to these lines, i.e. line numbers > 3
s/,[^,]*$//
deletes the last field
Upvotes: 3