Reputation: 13
So I'm writing a bash script to alphabetically list names from a text file, but only the names that have the same frequency (given in the second column) as the name passed to the script.
grep -wi '$1' /usr/local/linuxgym-data/census/femalenames.txt |
awk '{ print ($2) }' |
grep '$1' /usr/local/linuxgym-data/census/femalenames.txt |
sort |
awk '{ print ($1) }'
Since I'm doing this for class, I've been given the example of inputting 'ANA', which should return
ANA
RENEE
The document has about 4500 lines in it, but the two lines I'm looking at are
ANA 0.120 55.989 181
RENEE 0.120 56.109 182
And so I want to find all names with the second column the same as ANA (0.120). The second column is the frequency of the name... This is just dummy data given to me by my school, so I don't know what that means. But if there was another name with the same frequency as ANA (0.120) it would also be listed in the output.
When I run the commands on their own, they work fine, but the script seems to have trouble on the 3rd line, where I try to use the awk output as $1 in the grep below it.
I am pretty new to this, so I'm most likely doing it in the most roundabout way.
Upvotes: 1
Views: 4557
Reputation: 34145
You could probably do this in one line, but that's pushing it a bit. Split it into two pieces to make it easier to write and read. For example:
name=$1
src=/usr/local/linuxgym-data/census/femalenames.txt
# get the frequency you're after
freq=$(awk -v name="$name" '$1==name {print $2}' "$src")
# get the names with that frequency
awk -v freq="$freq" '$2==freq {print $1}' "$src"
The tradeoff between this and RomanPerekhrest's solution is that theirs does one scan but indexes everything in memory. This one scans the file twice, but saves you the memory.
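If you want to try the two-pass approach without the course file, here is a minimal sketch that wraps it in a function and runs it on the two sample lines from the question. The function name `same_freq`, the temporary file, the early `exit` in the first awk, and the empty-result guard are assumptions added for illustration, not part of the script above.

```shell
# Sketch: two-pass lookup wrapped in a function (same_freq is a hypothetical name).
same_freq() {
    local name=$1 src=$2 freq
    # Pass 1: frequency of the requested name (stop at the first match).
    freq=$(awk -v name="$name" '$1 == name { print $2; exit }' "$src")
    # Assumption: print nothing and return non-zero if the name is not in the file.
    [ -n "$freq" ] || return 1
    # Pass 2: every name whose second column equals that frequency, sorted.
    awk -v freq="$freq" '$2 == freq { print $1 }' "$src" | sort
}

# Sample data: the two lines quoted in the question.
sample=$(mktemp)
printf '%s\n' 'ANA 0.120 55.989 181' 'RENEE 0.120 56.109 182' > "$sample"

same_freq ANA "$sample"
```

On the sample data this prints ANA and RENEE, one per line.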
Upvotes: 1
Reputation: 8406
This should probably do it...
f="/usr/local/linuxgym-data/census/femalenames.txt"
grep $(grep -wi -m 1 "$1" $f | awk '{ print ($2) }') $f | \
sort | awk '{ print ($1) }'
Test...
echo 'ANA 0.120 55.989 181
RENEE 0.120 56.109 182' > fem
foo() { grep $(grep -wi -m 1 "$1" $f | awk '{ print ($2) }') $f | \
sort | awk '{ print ($1) }' ; }
f=fem ; foo ANA
Output:
ANA
RENEE
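One caveat worth checking: grep treats the substituted frequency as a regular expression and matches it anywhere on the line, so a pattern like 0.120 would also hit a line whose third column is, say, 90.120. Below is a minimal sketch of a stricter variant that keeps the command substitution but compares only the second field with awk. The function name `foo_exact` and the third sample line (ZOE) are made up for illustration.

```shell
# Test file: the two lines from the question plus one hypothetical line (ZOE)
# whose third column happens to contain the text "0.120".
f=$(mktemp)
printf '%s\n' 'ANA 0.120 55.989 181' 'RENEE 0.120 56.109 182' 'ZOE 0.012 90.120 999' > "$f"

# Hypothetical stricter variant: look the frequency up with grep as before,
# then match it against field 2 only instead of grepping the whole line.
foo_exact() {
    local freq
    freq=$(grep -wi -m 1 "$1" "$f" | awk '{ print $2 }')
    # Assumption: return non-zero if the name was not found.
    [ -n "$freq" ] || return 1
    awk -v freq="$freq" '$2 == freq { print $1 }' "$f" | sort
}

foo_exact ANA
```

With this test file, a plain `grep 0.120 "$f"` matches all three lines, while `foo_exact ANA` lists only ANA and RENEE.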
Upvotes: 0
Reputation: 92854
With single awk:
inp="ANA"
awk -v inp="$inp" '{ a[$1]=$2 } END { if(inp in a){ v=a[inp];
for(i in a){ if(a[i]==v) print i }}
}' /usr/local/linuxgym-data/census/femalenames.txt | sort
The output:
ANA
RENEE
a[$1]=$2 - accumulates the frequency value for each name
if(inp in a){ v=a[inp]; - if the input name inp is in the array, gets its frequency value
for(i in a){ if(a[i]==v) print i - prints all names that have the same frequency value as the input name
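The same single-pass logic, spread over several lines with comments and run against the two sample lines from the question, might look like the sketch below. The temporary file stands in for femalenames.txt and the `result` variable is only there to capture the output; both are assumptions for illustration.

```shell
inp="ANA"

# Stand-in for femalenames.txt: the two lines quoted in the question.
sample=$(mktemp)
printf '%s\n' 'ANA 0.120 55.989 181' 'RENEE 0.120 56.109 182' > "$sample"

result=$(awk -v inp="$inp" '
    # Every line: remember the frequency (column 2) under the name (column 1).
    { a[$1] = $2 }
    END {
        # Only proceed if the requested name was seen.
        if (inp in a) {
            v = a[inp]
            # Print every name stored with the same frequency.
            for (i in a)
                if (a[i] == v) print i
        }
    }
' "$sample" | sort)

printf '%s\n' "$result"
```

The order in which `for (i in a)` visits the names is unspecified in awk, which is why the output is piped through sort.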
Upvotes: 0