Reputation: 3362
I have a tab-delimited input file and I want to remove all of its empty columns. The empty columns are: $13, $14, $15, $84, $85, $86, $87, $88, $89, $91, $94
INPUT: tsv file with more than 90 columns
a b d e g...
a b d e g...
OUTPUT: tsv file without empty columns
a b d e g....
a b d e g...
Thank you
Upvotes: 2
Views: 4462
Reputation: 26481
remove ALL empty columns:
If a tab-delimited file has empty columns and you want to remove all of them, those empty columns show up as runs of consecutive tabs. You can therefore replace each run with a single tab, and then delete the leading tab that is left over if the first column was empty. Note that this works line by line, so a field that is empty on only some lines is removed on those lines as well:
sed 's/\t\+/\t/g;s/^\t//' <file>
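As a quick check of the command above, here is a minimal sketch on a made-up five-column row in which columns 3 and 4 are empty. The function name `collapse_tabs` is only for illustration, and the sketch assumes GNU sed, since `\t` and `\+` in a basic regular expression are GNU extensions.

```shell
# Hypothetical helper wrapping the sed command; assumes GNU sed.
collapse_tabs() {
  # Squeeze each run of tabs into one tab, then drop a leading tab if any.
  sed 's/\t\+/\t/g;s/^\t//'
}

# Sample row: "a", "b", two empty fields, "e".
printf 'a\tb\t\t\te\n' | collapse_tabs
```

The row comes out with three tab-separated fields: a, b and e.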
remove SOME columns: See Ed Morton's answer, or just use cut (the --complement option is specific to GNU cut):
cut --complement -f 13,14,15,84,85,86,87,88,89,91,94 <file>
remove selected columns if and only if they are empty:
Basically a simple adaptation of Ed Morton's answer:
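# col must be passed with -v: an assignment placed after the program is only applied after BEGIN has run
awk -v col=13,14,15,84,85,86,87,88,89,91,94 'BEGIN{FS=OFS="\t"; n=split(col,a,",")}
{ for(i=1;i<=n;++i) if ($a[i]=="") $a[i]=RS; gsub("(^|"FS")"RS,"") }
1' <file>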
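To see the "only if empty" behaviour on something smaller than a 90-column file, here is a minimal sketch with a made-up five-column row. The wrapper name `drop_if_empty` is hypothetical. The column list is passed with `-v` so that it is already set when the `BEGIN` block calls `split`.

```shell
# Hypothetical wrapper: $1 is a comma-separated list of column numbers.
drop_if_empty() {
  awk -v col="$1" 'BEGIN{FS=OFS="\t"; n=split(col,a,",")}
  {
    # Mark each listed column with RS, but only when it is empty on this line.
    for(i=1;i<=n;++i) if ($a[i]=="") $a[i]=RS
    # Delete every marked field together with the separator in front of it.
    gsub("(^|"FS")"RS,"")
  }
  1'
}

# Column 2 holds "b" and column 3 is empty; both are listed.
printf 'a\tb\t\td\te\n' | drop_if_empty 2,3
```

Only column 3 is dropped here, because column 2 is not empty on that line. One caveat with this sketch: a listed column number beyond the last field of a line counts as empty, and assigning to it makes awk pad the line with extra empty fields, so the list should stay within the real column count.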
Upvotes: 3
Reputation: 203607
This might be what you want:
$ printf 'a\tb\tc\td\te\n'
a b c d e
$ printf 'a\tb\tc\td\te\n' | awk 'BEGIN{FS=OFS="\t"} {$2=$4=""} 1'
a c e
$ printf 'a\tb\tc\td\te\n' | awk 'BEGIN{FS=OFS="\t"} {$2=$4=RS; gsub("(^|"FS")"RS,"")} 1'
a c e
Note that the above doesn't remove all empty columns, as some potential solutions might; it removes exactly the column numbers you specify, whether or not they are empty:
$ printf 'a\tb\t\td\te\n'
a b d e
$ printf 'a\tb\t\td\te\n' | awk 'BEGIN{FS=OFS="\t"} {$2=$4=RS; gsub("(^|"FS")"RS,"")} 1'
a e
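If the empty column numbers are not known in advance, one possible approach is to read the file twice: the first pass records which columns contain data on at least one line, and the second pass prints only those. This is a minimal sketch, not part of the commands above. The function name `drop_all_empty_columns` and the sample data are made up for illustration, and the input is assumed to be a regular, non-empty file, because it has to be read twice.

```shell
# Hypothetical helper: $1 is the path to a tab-delimited file.
drop_all_empty_columns() {
  awk 'BEGIN{FS=OFS="\t"}
  # Pass 1: remember every column that is non-empty on at least one line.
  NR==FNR { for(i=1;i<=NF;i++) if ($i!="") keep[i]=1; if (NF>max) max=NF; next }
  # Pass 2: rebuild each line from the remembered columns only.
  { out=""; sep=""
    for(i=1;i<=max;i++) if (i in keep) { out=out sep $i; sep=OFS }
    print out }' "$1" "$1"
}

# Sample file in which columns 2 and 4 are empty on every line.
tmp=$(mktemp)
printf 'a\t\tc\t\n1\t\t3\t\n' > "$tmp"
drop_all_empty_columns "$tmp"
rm -f "$tmp"
```

Unlike collapsing runs of tabs, this keeps a column that is empty on some lines but filled on others, so the remaining columns stay aligned.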
Upvotes: 6