Reputation: 687

Unix awk command regex problem

I have data like this:

# data_display  

ab as we hj kl  
12 34 45 83 21  
45 56 98 45 09

I need only the first column, and only for the rows that start with a number.

I now use:

# data_display | awk '{ print $1 }' | grep "^[0-9]"

Is there any way to optimise it more, like using the regex in awk itself?

I am very new to awk.

Thanks.

Upvotes: 1

Answers (6)

Reputation: 1099

cut -d' ' -f1 filename | grep '^[0-9]'

This should be faster than the awk version, since awk parses every line of the file into records and fields, whereas cut only extracts the first field.

Cutting out the first field up front also minimizes the amount of data that grep needs to process.
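One thing that may be worth checking before swapping awk for cut is how each tool splits fields. The sketch below uses a made-up input line with leading spaces (the `line` variable is only for illustration, it is not part of the question's data) to show the difference.

```shell
# Hypothetical line with two leading spaces before the first number.
line='  12 34 45'

# cut treats every single space as a delimiter, so field 1 is empty here.
cut_first=$(printf '%s\n' "$line" | cut -d' ' -f1)

# awk's default field splitting skips leading blanks and collapses runs of them.
awk_first=$(printf '%s\n' "$line" | awk '{ print $1 }')

printf 'cut: [%s]  awk: [%s]\n' "$cut_first" "$awk_first"
# prints: cut: []  awk: [12]
```

If the real output of data_display is always separated by exactly one space with no leading blanks, as in the sample, both approaches should give the same first column.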

Upvotes: 1

Reputation: 343201

For more accuracy, check for actual numbers, in case you have data like 1a, which is not a number but would still match the solutions given so far:

$ awk '$1+0==$1 { print $1 }' file

or

$ awk '$1 ~ /^[0-9]+$/ { print $1 }' file
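To make the difference concrete, here is a small sketch comparing the three tests on made-up input that includes a `1a` entry. The `mixed` variable and its contents are assumptions for illustration only.

```shell
# Hypothetical input: the second line starts with a digit but "1a" is not a number.
mixed='12 34
1a 56
ab 78
45 90'

# Starts-with-a-digit test: also lets "1a" through.
loose=$(printf '%s\n' "$mixed" | awk '/^[0-9]/ { print $1 }')

# Digits-only test on the first field: rejects "1a".
strict=$(printf '%s\n' "$mixed" | awk '$1 ~ /^[0-9]+$/ { print $1 }')

# Numeric comparison: also rejects "1a", but would accept values such as 1.5 or -3.
numeric=$(printf '%s\n' "$mixed" | awk '$1+0 == $1 { print $1 }')

printf 'loose:   %s\n' "$(printf '%s' "$loose" | tr '\n' ' ')"
printf 'strict:  %s\n' "$(printf '%s' "$strict" | tr '\n' ' ')"
printf 'numeric: %s\n' "$(printf '%s' "$numeric" | tr '\n' ' ')"
```

Which test fits best depends on what counts as a number in the data: the regex form accepts unsigned integers only, while the comparison form accepts anything awk can read as a number.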

Upvotes: 1

Reputation: 51561

You could use cut instead of awk:

$ data_display | grep '^[0-9]' | cut -f 1 -d ' '

Upvotes: 1

Reputation: 23383

You can place the grep regexp in the awk command directly:

data_display | awk '/^[0-9]/{ print $1 }'

Upvotes: 2

Reputation: 67330

In awk, the pattern (here a regular expression) comes before the action, which is the print statement together with its curly braces. So in your case, the awk call would be:

awk '/^[0-9]/ {print $1}'

Upvotes: 6

Reputation: 882776

Sure you can:

pax> echo 'ab as we hj kl  
12 34 45 83 21  
45 56 98 45 09' | awk '/^[0-9]/ {print $1}'

gives you:

12
45

An awk program consists of pattern/action pairs: a pattern to match and an action to run on each matching line. If there's no pattern, the action runs for every line.
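As a rough sketch of that structure, here are the three forms run against the sample data from the question (with trailing spaces dropped). The variable names are just for illustration.

```shell
# Sample data from the question (the variable name is illustrative).
sample='ab as we hj kl
12 34 45 83 21
45 56 98 45 09'

# Pattern and action: first field of the lines that start with a digit.
both=$(printf '%s\n' "$sample" | awk '/^[0-9]/ { print $1 }')

# Action only: with no pattern, the action runs for every line.
action_only=$(printf '%s\n' "$sample" | awk '{ print $1 }')

# Pattern only: with no action, awk prints the whole matching line.
pattern_only=$(printf '%s\n' "$sample" | awk '/^[0-9]/')

printf '%s\n' "$both"
```

The pattern-only form behaves like grep '^[0-9]', which is why the separate grep in the original pipeline is not needed.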

Upvotes: 0