pog7776

Reputation: 13

Piping awk output into grep

I'm writing a bash script that alphabetically lists names from a text file, but only the names that have the same frequency (the second column) as a given name.

grep -wi '$1' /usr/local/linuxgym-data/census/femalenames.txt |
awk '{ print ($2) }' |
grep '$1' /usr/local/linuxgym-data/census/femalenames.txt |
sort |
awk '{ print ($1) }'

Since I'm doing this for class, I've been given the example of inputting 'ANA', which should return

ANA

RENEE

The file has about 4500 lines in it, but the two lines I'm looking at are

ANA            0.120     55.989    181

RENEE          0.120     56.109    182

And so I want to find all names with the second column the same as ANA (0.120). The second column is the frequency of the name... This is just dummy data given to me by my school, so I don't know what that means. But if there was another name with the same frequency as ANA (0.120) it would also be listed in the output.

When I run the commands on their own they work fine, but the pipeline seems to have trouble on the 3rd line, where I try to use the awk output as $1 in the grep below it.

I am pretty new to this, so I'm most likely doing it in the most roundabout way.

Upvotes: 1

Views: 4557

Answers (3)

viraptor

Reputation: 34145

You could probably do this in one line, but that's pushing it a bit. Split it into two pieces to make it easier to write and read. For example:

name=$1
src=/usr/local/linuxgym-data/census/femalenames.txt

# get the frequency you're after
freq=$(awk -v name="$name" '$1==name {print $2}' "$src")

# get the names with that frequency
awk -v freq="$freq" '$2==freq {print $1}' "$src"

The tradeoff between this and RomanPerekhrest's solution is that theirs does one scan of the file but indexes every name in memory. This one scans the file twice, but saves you the memory.
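If it helps, the two steps above could be wrapped up roughly like this. It's only a sketch: the function name `same_freq`, the `exit` after the first match, the empty-result guard, and the trailing `sort` (the question asks for alphabetical output) are my additions, and the sample file is made-up data in the same layout as the question's.

```shell
#!/usr/bin/env bash

# same_freq NAME FILE (hypothetical helper, not part of the original answer)
same_freq() {
  local name=$1 src=$2 freq

  # First scan: frequency of the first line whose name field matches exactly
  freq=$(awk -v name="$name" '$1 == name { print $2; exit }' "$src")

  # Stop early if the name is not in the file
  [ -n "$freq" ] || return 1

  # Second scan: every name with that frequency, sorted alphabetically
  awk -v freq="$freq" '$2 == freq { print $1 }' "$src" | sort
}

# Made-up sample data in the same four-column layout as the question
sample=$(mktemp)
printf '%s\n' \
  'RENEE          0.120     56.109    182' \
  'ZOE            0.500     60.000    200' \
  'ANA            0.120     55.989    181' > "$sample"

out=$(same_freq ANA "$sample")
missing=$(same_freq NOBODY "$sample" || echo "not found")
rm -f "$sample"

printf '%s\n' "$out"
```

In a real script you would call it as `same_freq "$1" /usr/local/linuxgym-data/census/femalenames.txt`.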

Upvotes: 1

agc

Reputation: 8406

This should probably do it...

f="/usr/local/linuxgym-data/census/femalenames.txt"
grep $(grep -wi -m 1 "$1" $f | awk '{ print ($2) }') $f | \
  sort | awk '{ print ($1) }'

Test...

echo 'ANA            0.120     55.989    181
RENEE          0.120     56.109    182' > fem
foo() { grep $(grep -wi -m 1 "$1" $f | awk '{ print ($2) }') $f | \
         sort | awk '{ print ($1) }' ; }
f=fem ; foo ANA

Output:

ANA
RENEE
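As far as I can tell, this works where the pipeline in the question does not for two reasons. Inside single quotes the shell does not expand `$1`, so grep searches for the literal text. And a pipe only feeds grep's standard input, which grep ignores once it is given a file name, whereas `$( ... )` turns the awk output into an argument. Here's a small sketch of both points. The temp file and `set -- ANA` just simulate the script's input, and the `-F` is my addition so that the `.` in `0.120` is not treated as a regex wildcard.

```shell
#!/usr/bin/env bash

# Made-up sample data standing in for femalenames.txt
data=$(mktemp)
printf '%s\n' \
  'ANA            0.120     55.989    181' \
  'RENEE          0.120     56.109    182' \
  'ZOE            0.500     60.000    200' > "$data"

set -- ANA   # simulate calling the script with ANA as its first argument

# Single quotes: grep is handed the two characters $ and 1, so nothing matches
literal=$(grep -c '$1' "$data" || true)

# Double quotes: the shell expands $1 to ANA before grep runs
freq=$(grep -wi -m 1 "$1" "$data" | awk '{ print $2 }')

# Command substitution already put the frequency in a variable;
# pass it to the second grep as an argument (a fixed string, via -F)
matches=$(grep -F -- "$freq" "$data" | awk '{ print $1 }' | sort)

rm -f "$data"

echo "lines matching the literal \$1: $literal"
echo "frequency of $1: $freq"
printf '%s\n' "$matches"
```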

Upvotes: 0

RomanPerekhrest

Reputation: 92854

With single awk:

inp="ANA"
awk -v inp="$inp" '{ a[$1]=$2 } END { if(inp in a){ v=a[inp];
       for(i in a){ if(a[i]==v) print i }}
}' /usr/local/linuxgym-data/census/femalenames.txt | sort

The output:

ANA
RENEE

  • a[$1]=$2 - stores the frequency value for each name, keyed by the name

  • if(inp in a){ v=a[inp]; - if the input name inp is in the array, get its frequency value

  • for(i in a){ if(a[i]==v) print i - print every name whose frequency value equals that of the input name
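If holding every name in the array is a concern, one possible variation on the steps above keeps a single awk call but passes the file name twice and uses the common `NR == FNR` idiom. The first read only records the input name's frequency and the second read prints the matches. Treat it as a sketch under assumptions: the temp file holds made-up data, and it relies on the file being non-empty.

```shell
#!/usr/bin/env bash

# Made-up sample data standing in for femalenames.txt
src=$(mktemp)
printf '%s\n' \
  'RENEE          0.120     56.109    182' \
  'ZOE            0.500     60.000    200' \
  'ANA            0.120     55.989    181' > "$src"

inp="ANA"

result=$(awk -v inp="$inp" '
  # First read of the file (NR equals FNR only while reading the first operand):
  # remember the frequency of the input name, print nothing
  NR == FNR { if ($1 == inp) v = $2; next }

  # Second read: print names whose frequency matches, if the name was found
  v != "" && $2 == v { print $1 }
' "$src" "$src" | sort)

rm -f "$src"

printf '%s\n' "$result"
```

This trades the array for a second read of the file, so memory use stays constant regardless of how many names there are.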

Upvotes: 0
