Reputation: 1491
I have a file like the one below:
this is a sample file
this file will be used for testing
I want to count the words using AWK.
The expected output is:
this 2
is 1
a 1
sample 1
file 2
will 1
be 1
used 1
for 1
I have written the AWK command below, but I am getting some errors:
cat anyfile.txt|awk -F" "'{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}'
Upvotes: 6
Views: 27366
Reputation: 3451
Here is Perl code which provides similar sorted output to Jotne's awk solution:
perl -ne 'for (split /\s+/, $_){ $w{$_}++ }; END{ for $key (sort keys %w) { print "$key $w{$key}\n"}}' testfile
$_ is the current line, which is split on whitespace (/\s+/).
Each word is then put into $_.
The %w hash stores the number of occurrences of each word.
After the entire file is processed, the END{} block is run.
The keys of the %w hash are sorted alphabetically.
Each word $key and its number of occurrences $w{$key} are printed.
Upvotes: 0
Reputation: 2771
Instead of looping over the fields of each line and saving each word in an array ({for(i=1;i<=NF;i++) a[$i]++}), use gawk, which supports a multi-character RS (record separator), so that each word becomes its own record and can be counted directly (this is a little faster):
gawk '{a[$0]++} END{for (k in a) print k,a[k]}' RS='[[:space:]]+' file
Output:
used 1
this 2
be 1
a 1
for 1
testing 1
file 2
will 1
sample 1
is 1
In the gawk command above, I define the space character class [[:space:]]+ (one or more whitespace characters, including spaces and the \n newline character) as the record separator.
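For awk implementations that do not support a regex RS, the same one-word-per-record idea can be sketched with a tr preprocessing step. This is a swapped-in technique, not the gawk command above; the file name sample.txt and the function name count_words are hypothetical.

```shell
# Hypothetical sample file matching the question's input.
printf '%s\n' 'this is a sample file' 'this file will be used for testing' > sample.txt

# count_words: put every word on its own line with tr, then count lines in awk.
# The NF pattern skips any empty line (e.g. one caused by leading whitespace).
count_words() {
    tr -s '[:space:]' '\n' < "$1" |
        awk 'NF { a[$0]++ } END { for (k in a) print k, a[k] }'
}

count_words sample.txt
```

As with the gawk version, the order of the output lines is whatever order awk iterates the array in, so it may differ between runs and implementations.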
Upvotes: 2
Reputation: 41460
It works fine for me:
awk '{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}' testfile
used 1
this 2
be 1
a 1
for 1
testing 1
file 2
will 1
sample 1
is 1
PS: you do not need to set -F" ", since the default field separator is already any run of blanks.
PS2: do not use cat with programs that can read the file themselves, like awk.
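A minimal sketch of how these two notes apply to the command from the question. It also shows a likely cause of the error there: with no space between -F" " and the quoted program, the shell joins them into a single argument, so awk receives no program text. The helper nargs and the function word_counts are hypothetical names, and the sample file is recreated here only so the snippet is self-contained.

```shell
# Hypothetical sample file matching the question's input.
printf '%s\n' 'this is a sample file' 'this file will be used for testing' > anyfile.txt

# nargs prints how many arguments the shell passes to a command.
nargs() { echo "$#"; }

nargs -F" "'{print}'     # prints 1: the program is glued onto the -F value
nargs -F" " '{print}'    # prints 2: option and program are separate words

# Same counting logic as the question, without cat and without -F.
word_counts() {
    awk '{ for (i = 1; i <= NF; i++) a[$i]++ } END { for (k in a) print k, a[k] }' "$1"
}

word_counts anyfile.txt
```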
You can pipe the output through sort to order it by count:
awk '{for(i=1;i<=NF;i++) a[$i]++} END {for(k in a) print k,a[k]}' testfile | sort -k 2 -n
a 1
be 1
for 1
is 1
sample 1
testing 1
used 1
will 1
file 2
this 2
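If the most frequent words should come first, one possible variation is to sort on the count in reverse numeric order and break ties on the word. This is only a sketch; the function name sorted_counts is hypothetical, and testfile is recreated here so the snippet runs on its own.

```shell
# Hypothetical sample file matching the question's input.
printf '%s\n' 'this is a sample file' 'this file will be used for testing' > testfile

# sorted_counts: count words, then sort by count (descending) and by word (ascending).
# -k2,2nr restricts the numeric reverse key to field 2; -k1,1 orders ties by word.
sorted_counts() {
    awk '{ for (i = 1; i <= NF; i++) a[$i]++ } END { for (k in a) print k, a[k] }' "$1" |
        sort -k2,2nr -k1,1
}

sorted_counts testfile
```

Limiting each key with the start,end form (2,2 and 1,1) keeps sort from comparing the rest of the line as part of that key.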
Upvotes: 12