Reputation: 947
There is a table in the format below. Is it possible to write an AWK script that reformats the table so that it excludes columns which contain only the number "1"?
ST L1 L2 L3 L4 L5
ST2 1 1 1 1 1
ST2 1 0 1 0 1
ST3 1 0 1 0 1
ST3 0 0 1 1 1
ST4 1 0 1 0 1
ST5 1 0 1 0 1
ST6 1 0 1 0 1
ST7 0 0 1 1 1
ST8 0 0 1 0 1
ST9 1 0 1 0 1
Output should be as below:
ST L1 L2 L4
ST2 1 1 1
ST2 1 0 0
ST3 1 0 0
ST3 0 0 1
ST4 1 0 0
ST5 1 0 0
ST6 1 0 0
ST7 0 0 1
ST8 0 0 0
ST9 1 0 0
I more or less understand the logic for deciding whether a column should be printed: for each column (skipping the header, NR==1, and column $1), increment a counter every time a 1 is found, then compare that counter in the END block with the number of data rows (NR minus the header). A column whose counter equals that number contains only 1s and should be left out; every other column is printed. My trouble lies in actually printing the columns in the END block, as I am trying to use arrays and I am still learning AWK and arrays. I am sure there is some clever way of doing this without even using arrays, simply by changing the way AWK looks at the data.
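For illustration, the counting idea could be sketched roughly as below. This is only a sketch under assumptions: the file name `table.txt` and the array names `line`, `ones` and `f` are made up here, the sample is just the first five lines of the table above, and a column is dropped when its count of 1s equals the number of data rows (NR - 1).

```shell
# Assumed sample file: the first five lines of the table in the question.
cat > table.txt <<'EOF'
ST L1 L2 L3 L4 L5
ST2 1 1 1 1 1
ST2 1 0 1 0 1
ST3 1 0 1 0 1
ST3 0 0 1 1 1
EOF

kept=$(awk '
{ line[NR] = $0 }                    # remember every line for the END block
NR > 1 {                             # skip the header
    for (i = 2; i <= NF; i++)        # skip column $1
        if ($i == 1) ones[i]++       # count the 1s seen in column i
}
END {
    for (r = 1; r <= NR; r++) {
        n = split(line[r], f, " ")   # re-split the stored line into fields
        out = f[1]
        for (i = 2; i <= n; i++)
            if (ones[i] != NR - 1)   # keep columns that are not all 1s
                out = out OFS f[i]
        print out
    }
}' table.txt)

printf '%s\n' "$kept"
```

On this shortened sample the columns L3 and L5 contain only 1s, so only ST, L1, L2 and L4 are printed.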
Upvotes: 2
Views: 395
Reputation: 8651
This should do the trick:
{
    # store current line
    line[FNR] = $0
    if (FNR > 1) # skip header
    {
        # select columns
        for (i = 1 ; i <= NF ; i++)
        {
            if ($i != 1) selected[i] = 1
        }
    }
}
END {
    for (li = 1 ; li <= FNR ; li++)
    {
        # parse current line
        $0 = line[li]
        # pick selected fields
        for (i = j = 1 ; i <= NF ; i++)
        {
            if (selected[i]) $(j++) = $i
        }
        # trim record to selection
        NF = j-1
        print
    }
}
After Ed Morton's remarks:
renamed the variable l to something less ambiguous
print "" being better than printf "\n"
After a second batch of remarks:
Thanks a lot for the proofreading. It's been nearly 15 years since I last did some serious awk programming, and the rust has sadly set in.
Upvotes: 2
Reputation: 203254
awk '
NR==FNR {
    if (NR > 1) {
        for (i=1;i<=NF;i++) {
            if ($i != 1) {
                nonOnes[i]
            }
        }
    }
    next
}
{
    ofs=""
    for (i=1;i<=NF;i++) {
        if (i in nonOnes) {
            printf "%s%s", ofs, $i
            ofs=OFS
        }
    }
    print ""
}
' file file
ST L1 L2 L4
ST2 1 1 1
ST2 1 0 0
ST3 1 0 0
ST3 0 0 1
ST4 1 0 0
ST5 1 0 0
ST6 1 0 0
ST7 0 0 1
ST8 0 0 0
ST9 1 0 0
If you don't want to list the same file twice on the command line, you can tweak the script by adding this BEGIN section:
BEGIN { ARGV[ARGC] = ARGV[ARGC-1]; ARGC++ }
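Put together, the single-argument variant might look like the sketch below. The file name `table.txt` is an assumption made for the example. Note that the BEGIN trick only makes sense when the input is a file named as the last command-line argument; it does not apply when awk reads from standard input.

```shell
# Assumed sample file holding the table from the question.
cat > table.txt <<'EOF'
ST L1 L2 L3 L4 L5
ST2 1 1 1 1 1
ST2 1 0 1 0 1
ST3 1 0 1 0 1
ST3 0 0 1 1 1
ST4 1 0 1 0 1
ST5 1 0 1 0 1
ST6 1 0 1 0 1
ST7 0 0 1 1 1
ST8 0 0 1 0 1
ST9 1 0 1 0 1
EOF

result=$(awk '
BEGIN { ARGV[ARGC] = ARGV[ARGC-1]; ARGC++ }   # queue the file a second time
NR==FNR {                                     # first pass: find columns to keep
    if (NR > 1) {
        for (i=1;i<=NF;i++) {
            if ($i != 1) {
                nonOnes[i]
            }
        }
    }
    next
}
{                                             # second pass: print kept columns
    ofs=""
    for (i=1;i<=NF;i++) {
        if (i in nonOnes) {
            printf "%s%s", ofs, $i
            ofs=OFS
        }
    }
    print ""
}
' table.txt)

printf '%s\n' "$result"
```

The file is named once, and awk still reads it twice because BEGIN appends a copy of the last argument to ARGV before any input is processed.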
Upvotes: 2