Reputation: 687
I have data like this:
# data_display
ab as we hj kl
12 34 45 83 21
45 56 98 45 09
I need just the first column alone, and only the rows starting with numbers.
I now use:
# data_display | awk '{ print $1 }' | grep "^[0-9]"
Is there any way to optimise it more, like using the regex in awk itself?
I am very new to awk.
Thanks.
KK
Upvotes: 1
Views: 1797
Reputation: 1099
cut -d' ' -f1 filename | grep '^[0-9]'
This should be the fastest option, since awk has to split every line into records and fields, whereas cut only extracts the first field.
Cutting first also minimizes the amount of data that grep needs to process.
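One caveat worth checking before swapping awk for cut: cut -d' ' treats every single space as a delimiter, while awk's default splitting skips leading whitespace and collapses runs of spaces. A minimal sketch, assuming a hypothetical input line that is indented by two spaces (the variable names are made up for illustration):

```shell
# Hypothetical input: the line starts with two spaces.
line='  12 34'

# awk ignores the leading whitespace, so the first field is 12.
awk_field=$(printf '%s\n' "$line" | awk '{ print $1 }')

# cut sees an empty first field before the first space.
cut_field=$(printf '%s\n' "$line" | cut -d' ' -f1)

printf 'awk: [%s]\ncut: [%s]\n' "$awk_field" "$cut_field"
```

If the real data is always single-space separated with no indentation, as in the question, both approaches should give the same first column.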
Upvotes: 1
Reputation: 343191
For more accuracy, check for actual numbers, in case you have data like 1a, which is not a number but will match with the solutions given so far:
$ awk '$1+0==$1' file
or
awk '$1 ~/^[0-9]+$/' file
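To illustrate the difference, here is a small sketch that runs the prefix match and the whole-field match on the same input. The sample data is the question's data with a hypothetical 1a row added, and { print $1 } is appended so only the first column is shown (an assumption based on what the question asks for):

```shell
# Hypothetical sample: the third row starts with a digit but its first field is not a number.
sample='ab as we hj kl
12 34 45 83 21
1a 56 98 45 09
45 56 98 45 09'

# Prefix match: only the first character of the line is checked, so 1a is kept.
prefix_match=$(printf '%s\n' "$sample" | awk '/^[0-9]/ { print $1 }')

# Whole-field match: the first field must consist of digits only, so 1a is dropped.
strict_match=$(printf '%s\n' "$sample" | awk '$1 ~ /^[0-9]+$/ { print $1 }')

printf 'prefix match:\n%s\n' "$prefix_match"
printf 'strict match:\n%s\n' "$strict_match"
```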
Upvotes: 1
Reputation: 51561
You could use cut instead of awk:
$ data_display | grep '^[0-9]' | cut -f 1 -d ' '
Upvotes: 1
Reputation: 23383
You can place the grep regexp in the awk command directly:
data_display | awk '/^[0-9]/{ print $1 }'
Upvotes: 2
Reputation: 67330
In awk, the regular expression (the pattern) comes before the action, which is the block in curly braces containing the print statement. So in your case, the awk call would be:
awk '/^[0-9]/ {print $1}'
Upvotes: 6
Reputation: 882756
Sure you can:
pax> echo 'ab as we hj kl
12 34 45 83 21
45 56 98 45 09' | awk '/^[0-9]/ {print $1}'
gives you:
12
45
Awk commands consist of a pattern to match and an action to run. If there's no pattern, the action runs for all lines.
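As a rough sketch of how the pattern and action parts combine, the three variants below run against the question's sample data (the variable names are just for illustration). The last one relies on awk's default action, which prints the whole line when a pattern is given without a block:

```shell
sample='ab as we hj kl
12 34 45 83 21
45 56 98 45 09'

# Pattern and action: first field of lines that start with a digit.
both=$(printf '%s\n' "$sample" | awk '/^[0-9]/ { print $1 }')

# Action only: runs for every line, so the first field of the header row is printed too.
action_only=$(printf '%s\n' "$sample" | awk '{ print $1 }')

# Pattern only: the default action prints each matching line in full.
pattern_only=$(printf '%s\n' "$sample" | awk '/^[0-9]/')

printf '%s\n---\n%s\n---\n%s\n' "$both" "$action_only" "$pattern_only"
```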
Upvotes: 0