Reputation: 7469
I have a large file with hundreds of columns. I want to remove only the third and fourth columns and print the rest to a file. My initial idea was an awk script like awk '{print $1, $2, for (i=$5; i <= NF; i++) print $i }' file > outfile. However, this code does not work.
I then tried:
awk '{for(i = 1; i<=NF; i++)
if(i == 3 || i == 4) continue
else
print($i)}' file > outfile
But this just printed everything out in a single column, one field per line. It would be possible to split this up into two scripts and combine them with Unix paste, but this seems like something that should be doable in one line.
Upvotes: 22
Views: 38565
Reputation: 10039
The hard but generic way (overkill if all you need is a simple one-liner):
awk -v "Exclude=3:4:5" '
# load exclusion
BEGIN{
Count=split(Exclude, aTmp, ":")
for( i = 1; i <= Count; i++) aExc[ aTmp[ i]]=1
}
# treat each line, taking only wanted field
{
Result=""
for( i = 1; i <= NF; i++) {
# field to take ?
if( ! aExc[ i]) {
# first element or add a separator before
if( Result != "") Result=Result OFS $i
else Result=$i
}
}
print Result
}' YourFile
The fields to exclude are listed in the first line, separated by :
Upvotes: 0
Reputation: 327
Yes, it's possible to just set the third and fourth columns to an empty string; but, in addition, field $1 should be set to itself ($1=$1) to make awk actually consume the input field separator (delimiter) : on the entire current line $0 in one go.
echo 1:2:3:4:5:6:7:8:9:10 | awk -F: '{ $1=$1; $3=""; $4=""; print $0}'
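Note that once any field is assigned, awk rebuilds the line using the output field separator, which defaults to a single space. A minimal sketch, assuming you want to keep : as the delimiter in the output, sets OFS as well; the blanked fields still remain as empty columns:

```shell
# Assumption: the output should use the same ':' delimiter as the input,
# so OFS is set alongside -F.
result=$(echo 1:2:3:4:5:6:7:8:9:10 | awk -F: -v OFS=: '{ $3=""; $4=""; print }')
echo "$result"   # 1:2:::5:6:7:8:9:10
```

The three consecutive colons show that columns 3 and 4 are emptied rather than removed.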
Upvotes: 4
Reputation: 933
What about something like:
cat SOURCEFILE | cut -f1-2,5- >> DESTFILE
It prints the first two columns, skips the 3rd and 4th, and then prints from the 5th onwards to the end.
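cut splits on tabs by default, so the command above assumes a tab-delimited file. A small sketch for another single-character delimiter, using made-up comma-separated input, passes -d:

```shell
# Hypothetical comma-separated sample line; -d sets the field delimiter.
result=$(printf 'a,b,c,d,e,f\n' | cut -d, -f1-2,5-)
echo "$result"   # a,b,e,f
```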
Upvotes: 18
Reputation: 19665
Say you have a tab delimited file that looks like the following:
temp.txt
field1 field2 field3 field4 field5 field6
field1 field2 field3 field4 field5 field6
field1 field2 field3 field4 field5 field6
Running the following will remove fields 3 and 4 and output the rest of the line:
awk '{print $1"\t"$2"\t"substr($0, index($0,$5))}' temp.txt
field1 field2 field5 field6
field1 field2 field5 field6
field1 field2 field5 field6
My examples print to stdout. > newFile will send stdout to newFile, and >> newFile will append to newFile.
So you may want to use the following:
awk '{print $1"\t"$2"\t"substr($0, index($0,$5))}' temp.txt > newFile.txt
Some will argue for cut:
cut -f1,2,5- temp.txt
which produces the same output. cut is great for simplicity, but it does not handle inconsistent delimiters, for example a mixture of different whitespace characters. However, in this case cut may be what you are after.
You could also accomplish this in Perl, Python, Ruby, and many other languages, but here is the simplest awk solution.
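To illustrate the delimiter caveat, one possible workaround (a sketch, not part of the original answer) is to squeeze runs of spaces and tabs into single tabs with tr before handing the line to cut. The sample line is made up, and the approach assumes lines do not start with whitespace, since a leading blank would become an empty first field:

```shell
# Normalise mixed spaces/tabs to single tabs, then drop fields 3 and 4.
result=$(printf 'f1  f2\tf3   f4 f5\t\tf6\n' | tr -s '[:blank:]' '\t' | cut -f1,2,5-)
echo "$result"
```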
Upvotes: 7
Reputation: 121
How about just setting the third and fourth columns to an empty string:
echo 1 2 3 4 5 6 7 8 9 10 |
awk -F" " '{ $3=""; $4=""; print}'
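Blanking the fields keeps their separators, so the output has extra spaces where columns 3 and 4 used to be. If that matters, a sketch that skips the two columns instead of emptying them might look like this (the sep variable is just an illustrative name):

```shell
# Print every field except 3 and 4, joined by OFS, with no empty columns.
result=$(echo 1 2 3 4 5 6 7 8 9 10 |
  awk '{
    sep = ""                           # no separator before the first field
    for (i = 1; i <= NF; i++) {
      if (i == 3 || i == 4) continue   # skip the unwanted columns
      printf "%s%s", sep, $i
      sep = OFS
    }
    print ""                           # terminate the record
  }')
echo "$result"   # 1 2 5 6 7 8 9 10
```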
Upvotes: 12
Reputation: 225112
Your first try was pretty close. Modifying it to use printf and including the field separators worked for me:
awk '{printf "%s%s%s", $1, FS, $2; for (i=5; i <= NF; i++) printf "%s%s", FS, $i; print "" }'
Upvotes: 20