Zerg12

Reputation: 317

Unix script is sorting the input

I am having some trouble with my homework assignment. Could you advise what to read or which commands I can use to create the following?

Create a shell script test that will act as follows:

  1. The script will display the following message on the terminal screen: Enter file names (wild cards OK)
  2. The script will read the list of names.
  3. For each file on the list that is a proper file, display a table giving the ten most frequently used words in the file, sorted with the most frequent first. Include the count.
  4. Repeat steps 1-3 over and over until the user indicates end-of-file. This is done by entering the single character Ctrl-d as a file name.

Here is what I have so far:

#!/bin/bash
echo 'Enter file names (wild cards OK)'
read input_source
if test -f "$input_source"
then 

Upvotes: 0

Views: 175

Answers (3)

clt60

Reputation: 63922

I usually ignore homework questions that show no progress or effort to learn something, but you're so beautifully cheeky that I'll make an exception.

Here is what you want:

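while read -r -e -p 'Files?> ' files
do
    for file in $files    # unquoted on purpose so wildcards expand
    do
        [ -f "$file" ] || continue    # skip anything that is not a regular file
        echo "== word counts for $file =="
        # split into one word per line, count duplicates, keep the ten most frequent
        tr -cs '[:alnum:]' '\n' < "$file" | sort | uniq -c | sort -nr | head
    done
done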

And now, at least try to understand what the above is doing...

PS: voting to close...

Upvotes: 1

glenn jackman

Reputation: 246827

Some tips:

  1. have access to the complete bash manual: it's daunting at first, but it's an invaluable reference -- http://www.gnu.org/software/bash/manual/bashref.html

  2. You can get help about bash builtins at the command line: try help read

  3. the read command can handle printing the prompt with the -p option (see previous tip)

  4. you'll accomplish the last step with a while loop:

    while read -p "the prompt" filenames; do 
        # ...
    done
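
To tie tips 3 and 4 together, here is a rough sketch of how the prompt loop and the "proper file" check might fit. The function names list_regular_files and prompt_loop are made up for illustration, and the sketch assumes file names without spaces. read returns a non-zero status at end-of-file, which is why Ctrl-D ends the loop.

```shell
#!/bin/bash
# Hypothetical helper: expand one line of patterns and print only regular files.
list_regular_files() {
    local file
    # $1 is deliberately unquoted so the shell splits it and expands wildcards.
    for file in $1; do
        [ -f "$file" ] && printf '%s\n' "$file"
    done
    return 0
}

# Hypothetical driver: prompt repeatedly until read fails (Ctrl-D sends end-of-file).
prompt_loop() {
    local filenames
    while read -r -p 'Enter file names (wild cards OK) ' filenames; do
        list_regular_files "$filenames"
    done
}
```

Calling prompt_loop at the end of the script starts the interaction; replacing the printf with the word-count pipeline gives the rest of the assignment.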
    

Upvotes: 1

kojiro

Reputation: 77127

How to find the ten most frequently used words in a file

Assumptions:

  1. The files given have one word per line.
  2. The files are not huge, so efficiency isn't a primary concern.

You can use sort and uniq -c to count how often each value occurs in a file, then a reverse-numeric sort to put the counts in descending order, and head to keep only the first ten.

sort "$afile" | uniq -c | sort -rn | head
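
If the first assumption does not hold (ordinary prose rather than one word per line), one possible approach is to split the text into words with tr before counting. This is only a sketch: the function name top_words is invented here, and treating "The" and "the" as the same word is my own assumption, not something the assignment states.

```shell
#!/bin/bash
# Hypothetical helper: print the ten most frequent words in the file "$1".
top_words() {
    tr -cs '[:alnum:]' '\n' < "$1" |   # turn every run of non-alphanumerics into one newline
        tr '[:upper:]' '[:lower:]' |   # assumption: counting is case-insensitive
        grep -v '^$' |                 # drop the empty line left by leading punctuation
        sort |                         # group identical words together
        uniq -c |                      # prefix each distinct word with its count
        sort -rn |                     # highest count first
        head -n 10                     # keep the top ten
}
```

Usage would be something like top_words "$afile". The width of the count column printed by uniq -c varies between systems, so pipe the result through awk or column if you need a tidier table.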

Upvotes: 1
