Reputation: 942
Given this input_file:
1234 1234 abcd
1234 abcd
awk doesn't recognise an empty column. When I run:
awk '{print $1,$2}' input_file
I get:
1234 1234
1234 abcd
How can I make awk give me:
1234 1234
1234
Upvotes: 1
Views: 2801
Reputation: 449
I think the simplest approach is to declare the field separator as '\t' (assuming the input is indeed tab-delimited).
awk -F'\t' '{print $1,$2}' file_name
Your code should work now as expected.
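As a quick check of that idea, here is a minimal sketch. The sample input is an assumption: it is generated with printf so that the columns really are separated by single tabs, with nothing between the two tabs on the second record.

```shell
# Assumed tab-separated sample: the second record has an empty second field.
out=$(printf '1234\t1234\tabcd\n1234\t\tabcd\n' | awk -F'\t' '{print $1,$2}')
printf '%s\n' "$out"
```

With a tab as the separator, two adjacent tabs delimit an empty field, so the second record prints 1234 followed only by the output separator. If the real file is padded with spaces rather than tabs, this will not help.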
Upvotes: 1
Reputation: 881513
The awk program usually uses field separators to decide which characters belong in which fields. If the second column of a line contains only spaces, there's no way to use that method to split as you wish.
However, GNU awk allows you to set a FIELDWIDTHS variable, which better suits fixed-width data, since that appears to be what you have:
pax> cat infile
1234 5678 abcd
1234      abcd
pax> awk 'BEGIN{FIELDWIDTHS="4 1 4"}{print "<"$1","$3">"}' infile
<1234,5678>
<1234,    >
It's field one and three in this case since field two is the space between the first and second real column:
1234 5678 abcd
\__/|\__/|\__/
  1 2  3 4  5
I usually do that since I don't want the space to become part of the data (in case I want a different character in the output as in my example) but, if you're transferring the space anyway, you could also use the simpler:
pax> awk 'BEGIN{FIELDWIDTHS="5 4"}{print "<"$1$2">"}' infile
<1234 5678>
<1234     >
In that case, field 1 is the five characters 1234<space>.
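FIELDWIDTHS is a GNU awk feature. If gawk isn't available, a rough equivalent (a different technique, not the one above) is to slice the record with substr(), which any POSIX awk provides. This is only a sketch, assuming the same layout: four characters, one space, four characters.

```shell
# Assumed sample data: the second record has four spaces where column two would be.
out=$(printf '%s\n' '1234 5678 abcd' '1234      abcd' |
    awk '{
        # Columns 1-4 are the first field, column 5 is the gap,
        # columns 6-9 are the second field.
        print "<" substr($0, 1, 4) "," substr($0, 6, 4) ">"
    }')
printf '%s\n' "$out"
```

The second output line keeps the four spaces of the empty column between the comma and the closing bracket, just as the FIELDWIDTHS version does.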
If you want fixed-width processing that adapts easily to later width changes, you can modify the awk script so it gets that information from the file itself.
It can't come from the actual data lines, since the fields there may contain spaces, but you can add a header line that fully specifies the widths to use (ensuring the header line isn't treated as data, of course).
The following transcript shows this in action (the awk script is now in a file since it's getting complex):
pax> cat infile
#### ###### ####
1234 567890 abcd
1234        abcd
pax> cat awkfile.awk
NR == 1 {
    # Header: construct field widths string
    #     "a 1 b 1 c 1 d ... z"
    # where a..z are lengths of fields.
    FIELDWIDTHS = length($1)
    for (i = 2; i <= NF; i++) {
        FIELDWIDTHS = FIELDWIDTHS" 1 "length($i)
    }
    next
}
{
    # Then use that FIELDWIDTHS string for
    # all other records.
    print "<"$1","$3">"
}
pax> awk -f awkfile.awk infile
<1234,567890>
<1234,      >
You'll find that you can change the field lengths as much as you want and, provided the header line is correct, it will adapt.
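For awks without FIELDWIDTHS, the same header-driven idea can be sketched with substr(). The start and width arrays are names made up for this illustration, and the sketch assumes exactly one separator character between columns.

```shell
# Assumed sample: header line of # marks, then two fixed-width data records.
out=$(printf '%s\n' '#### ###### ####' '1234 567890 abcd' '1234        abcd' |
    awk '
    NR == 1 {
        # Header: record where each column starts and how wide it is.
        pos = 1
        for (i = 1; i <= NF; i++) {
            start[i] = pos
            width[i] = length($i)
            pos += width[i] + 1    # skip the single separator column
        }
        next
    }
    {
        # Data records: slice columns one and two by position.
        print "<" substr($0, start[1], width[1]) "," substr($0, start[2], width[2]) ">"
    }')
printf '%s\n' "$out"
```

Changing the header widths, and padding the data to match, should change the slices without touching the script.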
Upvotes: 3
Reputation: 6345
Having a field whose content is the same as the field delimiter is more or less impossible. You need to consider manipulating the input data.
Here are some examples for fixed width fields:
$ awk '{gsub(" [[:space:]]{4} "," ---- ");print}' file1
1234 1234 abcd
1234 ---- abcd
You can revert back anytime:
$ awk '{gsub(" [[:space:]]{4} "," ---- ");print}' file1 |awk '{gsub("----","    ");print}'
1234 1234 abcd
1234      abcd
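To get from the placeholder to the output asked for in the question, one option is to substitute it and then blank it out again when printing. This is a sketch under two assumptions: the empty column is exactly four spaces with one space on either side, and ---- never occurs as real data. Each space is written as a bracket expression instead of the {4} interval only so the pattern also works in awks without interval support.

```shell
# Assumed sample data: the second record has a blank four-character column.
out=$(printf '%s\n' '1234 1234 abcd' '1234      abcd' |
    awk '{
        # Mark the blank column; changing $0 makes awk split the fields again.
        gsub(/ [ ][ ][ ][ ] /, " ---- ")
        # Print the placeholder as an empty string.
        print $1, ($2 == "----" ? "" : $2)
    }')
printf '%s\n' "$out"
```

The second record then prints 1234 followed by the output separator and an empty second field.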
For a non-fixed-width situation, you can use something like the command below, which turns a run of four or more whitespace characters (a space, at least two more, then a space) into something else:
$ awk '{gsub(" [[:space:]]{2,} "," - ");print}' file
1234 1234 abcd
1234 - abcd
Upvotes: 2