Reputation: 996
I have output that looks like this (the number of occurrences of each word, then the word):
3 I
2 come
2 from
1 Slovenia
But I want it to look like this:
I 3
come 2
from 2
Slovenia 1
I got my output with:
cut -d' ' -f1 "file" | uniq -c | sort -nr
I tried different things with additional pipes:
cut -d' ' -f1 "file" | uniq -c | sort -nr | cut -d' ' -f8 ...?
which is a good start, because the words come first, but then I no longer have access to the number of occurrences.
AWK and SED are not allowed!
EDIT: Alright, let's say the file looks like this:
I ....
come ...
from ...
Slovenia ...
I ...
I ....
come ...
from ....
"I" is repeated 3 times, "come" twice, "from" twice, and "Slovenia" once. The words are at the beginning of each line.
Upvotes: 1
Views: 1456
Reputation: 20980
If Perl is allowed:
$ cat testfile
I ....
come ...
from ...
Slovenia ...
I ...
I ....
come ...
from ....
$ perl -e 'my %list;
while(<>){
chomp; #strip \n from the end
s/^ *([^ ]*).*/$1/; #keep only 1st word
$list{$_}++; #increment count
}
foreach (keys %list){
print "$_ $list{$_}\n";
}' < testfile
come 2
Slovenia 1
I 3
from 2
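Perl does not guarantee any particular order when iterating over hash keys, which is why the lines above are not sorted by count. As a rough sketch, assuming the "word count" format shown, the result can be piped through sort to get the highest counts first. The variable name counts is only for illustration.

```shell
# Sample lines copied from the Perl output above, in their arbitrary order.
counts=$(printf '%s\n' 'come 2' 'Slovenia 1' 'I 3' 'from 2')

# Sort numerically on the second field (the count), largest first.
printf '%s\n' "$counts" | sort -k2,2nr
```

With the real command, the same sort stage would simply be appended after the perl invocation.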
Upvotes: 0
Reputation: 113844
AWK and SED are not allowed!
Starting with this:
$ cat file
3 I
2 come
2 from
1 Slovenia
The order can be reversed with this:
$ while read count word; do echo "$word $count"; done <file
I 3
come 2
from 2
Slovenia 1
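A slightly more defensive sketch of the same loop, assuming a POSIX shell: read -r keeps backslashes in the input literal, and printf avoids surprises if a word happens to look like an echo option such as -n. The function name swap_columns is made up for this example.

```shell
# swap_columns is a hypothetical helper name, not part of the original answer.
# It reads "count word" lines on stdin and prints "word count".
swap_columns() {
    while read -r count word; do
        printf '%s %s\n' "$word" "$count"
    done
}

# Same sample data as in "file" above.
printf '3 I\n2 come\n2 from\n1 Slovenia\n' | swap_columns
```

Because read strips leading whitespace with the default IFS, this also copes with the space-padded counts that uniq -c prints.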
Let us start with:
$ cat file2
I ....
come ...
from ...
Slovenia ...
I ...
I ....
come ...
from ....
Using your pipeline (with two changes) combined with the while loop:
$ cut -d' ' -f1 "file2" | sort | uniq -c | sort -snr | while read count word; do echo "$word $count"; done
I 3
come 2
from 2
Slovenia 1
The first change that I made to the pipeline was to put a sort before uniq -c. This is because uniq -c only merges adjacent identical lines, so it assumes that its input is sorted. The second change is to add the -s option to the second sort, which makes the sort stable so that the alphabetical order of the words with the same count is not lost.
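To see why the extra sort matters, here is a small sketch that runs uniq -c on the first words of file2 with and without sorting first. The variable names are only for illustration.

```shell
# First words of file2, in their original order.
words=$(printf '%s\n' I come from Slovenia I I come from)

# Without sorting: only the adjacent "I", "I" pair is merged,
# so the same word shows up on several lines with partial counts.
unsorted=$(printf '%s\n' "$words" | uniq -c)

# With sorting first: identical words become adjacent and are counted together.
sorted=$(printf '%s\n' "$words" | sort | uniq -c)

printf 'unsorted:\n%s\nsorted:\n%s\n' "$unsorted" "$sorted"
```

The unsorted run prints seven lines, with "I" appearing once with a count of 1 and once with a count of 2, while the sorted run prints four lines, one per distinct word.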
Upvotes: 3
Reputation: 27216
You can just pipe the output of your first try into awk:
$ cat so.txt
3 I
2 come
2 from
1 Slovenia
$ cat so.txt | awk '{ print $2 " " $1}'
I 3
come 2
from 2
Slovenia 1
Upvotes: 0