quyleanh

Reputation: 63

awk inline command and full script have different output

I want to count the number of leading spaces on each line. My sample text file is as follows:

aaaa bbbb cccc dddd
  aaaa bbbb cccc dddd
    aaaa bbbb cccc dddd
aaaa bbbb cccc dddd

When I wrote a simple script to count them, I noticed a difference between the output of awk's inline command form and its full-script form.

First try

#!/bin/bash
while IFS= read -r line; do
    echo "$line" | awk '
        {
            FS="[^ ]"
            print length($1)
        }
    '
done < "tmp"

The output is

4
4
4
4

Second try

#!/bin/bash
while IFS= read -r line; do
    echo "$line" | awk -F "[^ ]" '{print length($1)}'
done < "tmp"

The output is

0
2
4
0

I want to write a full script that produces the same output as the inline form.
Could anyone explain this difference? Thank you very much.

Upvotes: 1

Views: 554

Answers (3)

stack0114106

Reputation: 8711

You can try Perl. Simply capture the leading whitespace in a group and print its length (note that \s also matches tabs, not only spaces). The "a"=~/a/ at the end is just there to reset the regex captures after each line.

perl -nle ' /(^\s+)/; print length($1)+0; "a"=~/a/ '  count_space.txt
0
2
4
0

Upvotes: 0

James Brown

Reputation: 37394

Fixed your first try:

$ while IFS= read -r line; do
    echo "$line" | awk '
                   BEGIN {              # you forgot the BEGIN
                       FS="[^ ]"        # gotta set FS before record is read
                   }
                   {
                       print length($1)
                   }' 
  done < file

Output now:

0
2
4
0

And to speed it up, drop the shell loop and let awk read the file itself:

$ awk '
BEGIN {
    FS="[^ ]"
}
{
    print length($1)
}' file
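
To see why the BEGIN block matters, here is a small sketch (the three-line sample and the variable names `sample`, `late` and `early` are made up for illustration). An assignment to FS inside the main rule only takes effect for the next record, because the current record has already been split with the old separator. In the question's loop every line goes to a fresh awk process, so every line is a "first record" and is split on whitespace, which is where the repeated 4 comes from.

```shell
# Hypothetical sample data mirroring the question's file
sample='aaaa bbbb
  aaaa bbbb
    aaaa bbbb'

# FS assigned in the main rule: record 1 was already split on whitespace,
# so $1 is "aaaa" (length 4); the new FS applies from record 2 onward.
late=$(printf '%s\n' "$sample" | awk '{ FS="[^ ]"; print length($1) }' | paste -s -d ' ' -)

# FS assigned in BEGIN: every record is split with the new separator,
# so $1 is the run of leading spaces (possibly empty).
early=$(printf '%s\n' "$sample" | awk 'BEGIN { FS="[^ ]" } { print length($1) }' | paste -s -d ' ' -)

echo "FS set in main rule: $late"
echo "FS set in BEGIN:     $early"
```

With a POSIX-conforming awk such as gawk or mawk, the first line should show 4 2 4 and the second 0 2 4.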

Upvotes: 3

RavinderSingh13

Reputation: 133428

Could you please try the following, which does not change FS? Written and tested at https://ideone.com/N8QcC8

awk '{if(match($0,/^ +/)){print RSTART+RLENGTH-1} else{print 0}}' Input_file

OR try:

awk '{match($0,/^ */); print RLENGTH}' Input_file

Output will be:

0
2
4
0

Explanation: the first solution uses a simple if/else condition. In the if part, awk's match function is given a regex that matches the spaces at the start of the line, and RSTART+RLENGTH-1 is printed as the number of spaces. This works because RSTART and RLENGTH are built-in awk variables that match sets: RSTART is the position where the match starts and RLENGTH is its length. Since the regex is anchored at the start of the line, RSTART is 1 on a match, so the sum minus 1 equals the number of leading spaces.

The second solution, as per rowboat's suggestion, simply prints RLENGTH. Because /^ */ also matches zero spaces, it prints 0 for lines without leading spaces, so no if/else condition is needed.
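
As a rough illustration of what match sets for the two regexes, the sketch below prints RSTART and RLENGTH for a line without and a line with leading spaces (the two-line sample and the variable names `lines`, `plus` and `star` are assumptions made for this example, not part of the answer).

```shell
# Hypothetical two-line sample: one line without and one with leading spaces
lines=$(printf '%s\n' 'aaaa' '  aaaa')

# /^ +/ needs at least one space: on a failed match RSTART is 0 and RLENGTH is -1,
# which is why the first solution needs the else branch.
plus=$(printf '%s\n' "$lines" | awk '{ match($0, /^ +/); print RSTART, RLENGTH }' | paste -s -d ',' -)

# /^ */ always matches (possibly the empty string at position 1),
# so RLENGTH is directly the number of leading spaces.
star=$(printf '%s\n' "$lines" | awk '{ match($0, /^ */); print RSTART, RLENGTH }' | paste -s -d ',' -)

echo "/^ +/ -> $plus"
echo "/^ */ -> $star"
```

The first echo should report 0 -1 for the unindented line and 1 2 for the indented one; the second should report 1 0 and 1 2.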

Upvotes: 3
