Reputation: 19
RequestID CustomerID Status
101 101111 Error
102 323232 Success
103 33434 Error
So, I'm trying to print the first and second fields using split. The delimiter above is a tab. I know there are various other methods, but I'm trying to learn the split function in awk. I'm trying the code below:
awk '{split($1,a,"\t");split($2,b,"\t");print a[1], b[2]}' data
The above code prints only the first column ($1), not the second column ($2). Is there a specific reason why?
Thanks,
Upvotes: 0
Views: 3947
Reputation: 204035
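split takes 3 arguments: the string to split, the array to store the resulting pieces in, and the separator to split on, which defaults to FS if absent.
Given that, it should be obvious that your code should be: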
awk '{split($0,a,/\t/); print a[1], a[2]}' data
Note that the 3rd arg to split() is an RE and so you should NOT do either of these things suggested elsethread:
awk '{split($0,a,"\t")...
awk '{split($0,a,FS)...
"\t" is wrong because that is a constant string, not a constant RE (/\t/), and so requires awk to parse it twice, which leads to complications when escaping characters.
FS is wrong because that's just redundantly specifying the default that you'd get from split($0,a).
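To make the double-parsing point concrete, here is a small sketch that splits on a literal two-dot separator. The sample string a..b..c is made up for illustration. With a regex constant each dot needs one backslash; with a string constant each dot needs two, because awk first processes the string and then compiles the result as a regex.

```shell
# Sample string "a..b..c" is an assumption made for this illustration.
out=$(awk 'BEGIN {
    s = "a..b..c"
    # Regex constant: one backslash per literal dot.
    n_re = split(s, x, /\.\./)
    # String constant: the string is processed first, so each backslash is doubled.
    n_str = split(s, y, "\\.\\.")
    print n_re, n_str
}')
echo "$out"   # prints: 3 3
```

Both calls produce the same three pieces; the only difference is how much escaping the separator needs.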
Upvotes: 1
Reputation: 45293
In awk, the default field separator is whitespace. Here is the definition of whitespace:
Fields are normally separated by whitespace sequences (spaces, TABs, and newlines), not by single spaces.
So in your code, when you use $1 and $2, the line has already been split on the default field separator (whitespace). If you want to try the split function, you need to target $0 (the whole line). Others have provided the solution, so I won't repeat it.
One tip in your case: use FS as the fieldsep in the split function, so you needn't care whether there is a space, several spaces, a tab, or other mixed whitespace. For example:
awk '{split($0,a,FS); print a[1],a[2]}' file
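One caveat worth seeing in action: splitting on the default FS and splitting on a tab give different results when a field itself contains a space. A minimal sketch, assuming a hypothetical record whose second field is "Acme Corp" (not part of the original data):

```shell
# Hypothetical tab-separated record with a space inside the second field.
out=$(printf '101\tAcme Corp\tError\n' | awk '{
    # No third argument: split uses FS, which by default is any run of whitespace.
    n_ws = split($0, a)
    # Explicit tab separator: the space inside the second field is preserved.
    n_tab = split($0, b, "\t")
    print n_ws, n_tab, b[2]
}')
echo "$out"   # prints: 4 3 Acme Corp
```

For the data in the question, which has no embedded spaces, both approaches give the same pieces.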
Upvotes: 0
Reputation: 77145
This is how the split function works:
$ cat file
RequestID CustomerID Status
101 101111 Error
102 323232 Success
103 33433 Error
$ awk '{split($0,a,"\t"); print a[1],a[2]}' file
RequestID CustomerID
101 101111
102 323232
103 33433
The function takes a string (which in your case should be your entire line, i.e. $0), followed by an array name, in this case a. Lastly comes the delimiter (in your case "\t"), which defaults to FS (whitespace unless you change it) if not provided.
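As a small sketch of the same call, split also returns the number of pieces it stored, which is handy for looping over the array. The single input line below is taken from the sample data; the variable names n and i are just for illustration.

```shell
# One tab-separated record from the sample data, fed in with printf.
out=$(printf '101\t101111\tError\n' | awk '{
    # split returns how many elements were stored in the array.
    n = split($0, a, "\t")
    for (i = 1; i <= n; i++)
        print i, a[i]
}')
echo "$out"
```

This prints one line per piece, each prefixed with its index in the array (indices start at 1).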
Upvotes: 1
Reputation: 77137
It is printing a[1], which is the entire first field, and b[2], which is empty, because you're splitting the entire second field (for example, '101111') on tabs, which yields an array with one element.
Unless you change the field separator, awk will split input rows into fields on whitespace, so splitting on tabs is redundant. You could just print $1, $2. If you really want to see the split function in operation, try something other than whitespace:
awk '{split($1, a, "0"); print a[1], a[2];}' < input
1 1
1 2
1 3
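To check the explanation of the empty b[2] directly, this sketch runs the original split on $2 for one record from the sample data and prints the element count. The brackets around b[2] are only there to make the empty string visible.

```shell
# One tab-separated record from the sample data.
out=$(printf '101\t101111\tError\n' | awk '{
    # $2 is already a single field with no tab in it, so split stores one element.
    n = split($2, b, "\t")
    print n, b[1], "[" b[2] "]"
}')
echo "$out"   # prints: 1 101111 []
```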
Upvotes: 1