5Ermacs

Reputation: 15

Bash script beginner questions: Looping, Arrays and character checks

I'm taking a class that has us programming scripts in bash, and I am very new to bash.

My professor gave us an assignment: "Write a Bash script, count.sh, which builds a table of counts for the commands under /bin which start with each letter. For example, if there are 3 commands starting with "a" (alsaumute, arch & awk) while there may be 2 commands starting with "z" (zcat & zsh). The first and last lines your script will print would be: a 3 ... z 2"

So I've been working on this problem and I'm able to set up the loop, but I'm fuzzy on how I'm supposed to retrieve the bin commands (I'm assuming bin is a file with just a list of commands for bash?) and then check the first character of each one.

He told me to use ls and grep as a hint. I looked up ls (lists files/directories) and grep (searches for specific text), so I assume I use ls to get the bin commands somehow and then run grep on them in the loop?

declare -a letters=('a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z')

counter=0
while [ "$counter" -lt 26 ]; do
    current=${letters[$counter]}
    echo "The counter is $counter"
    echo "$current"
    let counter=counter+1
done

That's where I am so far, so my guess is that I create an array variable to hold all the bin commands as well (using ls, right?), and then use that in the loop with grep?

I just want some advice on whether I'm on the right path. I'm very new to Linux and I've never dealt with scripts or command-line work before.


Upvotes: 1

Views: 210

Answers (3)

Charles Duffy

Reputation: 295472

If using bash 4.0 or newer, the following is an implementation that uses only functionality built into the shell -- no ls, no cut, no grep, no awk, etc.

#!/usr/bin/env bash
#              ^^^^- must be run with bash, 4.0 or newer; NOT /bin/sh.

declare -A counts=( )            # declare an associative array mapping letters to counts
for entry in /bin/*; do          # use a glob to list filenames in /bin
  filename=${entry##*/}          # strip the path off the beginning of each name
  first_char=${filename:0:1}     # take the first character of what's left
  (( counts[$first_char] += 1 )) # and update the associative array's counter
done

# ...then, iterate over the keys in the associative array...
for first_char in "${!counts[@]}"; do
  # ...and print them alongside their associated values (the counts)
  printf '%s %s\n' "$first_char" "${counts[$first_char]}"
done

You can also view the associative array this builds with declare -p counts; it'll look something like the following (taken from a system where most interesting things are in /usr/bin rather than /bin, so rather sparse in the below example):

declare -A counts=([b]="1" [c]="4" [d]="4" [e]="3" [h]="1" [k]="2" [l]="4" [m]="2" [p]="3" [r]="2" [s]="4" [t]="2" [u]="1" [w]="1" [z]="1" )
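
One thing to watch for: the key order of an associative array is unspecified, and a letter with no matching files never gets a key at all, so the output above won't necessarily run from a to z. Below is a minimal sketch of one way to get the full a-to-z table the assignment describes, assuming only lowercase first letters matter. The function name count_by_letter and its optional directory argument are my own additions for illustration, not part of the script above.

```shell
#!/usr/bin/env bash
# Sketch only: requires bash 4.0+. count_by_letter is a hypothetical helper name.
count_by_letter() {
  local dir=${1:-/bin}            # directory to scan; defaults to /bin
  local -A counts=( )
  local entry filename first letter
  for entry in "$dir"/*; do
    [[ -e $entry ]] || continue   # skip the literal pattern if the glob matched nothing
    filename=${entry##*/}
    first=${filename:0:1}
    [[ $first == [[:lower:]] ]] || continue   # ignore names like "[" or "X11"
    counts[$first]=$(( ${counts[$first]:-0} + 1 ))
  done
  # Walk a..z in order, printing 0 for letters that never appeared.
  for letter in {a..z}; do
    printf '%s %s\n' "$letter" "${counts[$letter]:-0}"
  done
}

count_by_letter "${1:-/bin}"
```

The `${counts[$letter]:-0}` expansion substitutes 0 when a key is missing, which is what fills in the letters that have no commands.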

Some notes:

  • As a general rule, shell builtins are (much!) faster than launching an external tool to process a single value, but slower than having an external tool process a long list of values. Command substitutions and pipelines both launch subprocesses -- requiring the shell to create new copies of itself in memory, with some of those copies thereafter replacing themselves with instances of an external executable file. This is a substantial amount of overhead, and not something you want to do inside a tight loop.
  • ls, in particular, was designed and built for human consumption rather than machine parsing of its output. (Since ls emits its output in newline-separated form, and filenames on common UNIX systems can contain literal newlines, it's impossible for ls to represent all possible filenames in literal form!). Avoid its use in scripts and other scenarios where the goal is something other than human consumption.
  • declare -A is what makes an array associative, meaning that its keys can be arbitrary strings, rather than only non-negative integers. This is a relatively new feature, and is why the above is only compatible with bash 4.0 or newer.
  • "${array[@]}" iterates over values for that array, whereas "${!array[@]}" iterates over its keys. As you already know, "${array[$key]}" can be used to map from a key to the associated value. See the relevant bash-hackers page for more on arrays, or BashFAQ #5.
  • ${entry##*/} is a parameter expansion -- one of the most powerful tools for native string manipulation in bash. This particular one trims everything from the beginning to the last / found. ${filename:0:1} is another, taking one character from the filename starting at position 0.
  • (( )) creates an arithmetic context, where native C-style math syntax is available (for integer math only).
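
To make the last few notes concrete, here is a small sketch you can paste into a bash 4.0+ session. The path and the starting counts are made-up values for illustration, not taken from a real system.

```shell
#!/usr/bin/env bash
# Illustrative values only; /usr/local/bin/example-tool is a hypothetical path.
entry=/usr/local/bin/example-tool

filename=${entry##*/}        # remove the longest leading match of */
first_char=${filename:0:1}   # substring: offset 0, length 1
printf '%s %s\n' "$filename" "$first_char"   # example-tool e

declare -A counts=( [a]=3 [z]=2 )
(( counts[a] += 1 ))         # arithmetic context: integer math only
printf '%s\n' "${counts[a]}"                 # 4

# Key order of an associative array is unspecified, so sort for stable output.
printf '%s\n' "${!counts[@]}" | sort         # keys: a, then z
printf '%s\n' "${counts[@]}" | sort -n       # values: 2, then 4
```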

Upvotes: 4

Richard Hamilton

Reputation: 26444

To get a list of commands, you can use compgen -c. Pipe that to grep to find the commands that start with a given letter. To get the count, pipe grep's output to wc -l, which counts the matching lines.

declare -a letters=('a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j' 'k' 'l' 'm' 'n' 'o' 'p' 'q' 'r' 's' 't' 'u' 'v' 'w' 'x' 'y' 'z')

for current in "${letters[@]}"; do
    echo "$current $(compgen -c | grep "^$current" | wc -l)"
done
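
Note that compgen -c covers every command name the shell knows about (aliases, builtins, functions and everything on PATH), not only the files in /bin. If the table should be limited to /bin as the assignment asks, a rough sketch following the ls and grep hint might look like the following. The function name count_with_ls_grep and the directory argument are hypothetical choices of mine, and parsing ls output misbehaves on filenames that contain newlines.

```shell
#!/usr/bin/env bash
# Sketch only: one `ls | grep -c` per letter, restricted to a single directory.
# count_with_ls_grep is a hypothetical name; the directory defaults to /bin.
count_with_ls_grep() {
  local dir=${1:-/bin}
  local letter count
  for letter in {a..z}; do
    # grep -c prints the number of matching lines, so wc -l is not needed.
    # grep exits non-zero when nothing matches; `|| true` keeps that from
    # aborting a script that runs under `set -e`.
    count=$(ls "$dir" | grep -c "^$letter" || true)
    echo "$letter $count"
  done
}

count_with_ls_grep "${1:-/bin}"
```

This launches ls and grep 26 times, which is fine for a one-off assignment but slower than a single pass over the directory.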

Upvotes: 1

Ipor Sircer

Reputation: 3141

Happy laziness!

ls -1 /bin | cut -c1 | uniq -c
  1 a
 27 b
 10 c
 ...
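
uniq -c only merges adjacent identical lines, so this relies on ls already grouping names by first character, which can depend on the locale's collation rules. Here is a slightly less lazy sketch that sorts explicitly and prints the letter first, as the assignment's sample output does. The name first_letter_counts is hypothetical, and letters with no files are simply omitted.

```shell
#!/usr/bin/env bash
# Sketch only: same pipeline with an explicit sort, printing "letter count".
# first_letter_counts is a hypothetical name; the directory defaults to /bin.
first_letter_counts() {
  ls -1 "${1:-/bin}" |
    cut -c1 |                   # keep the first character of each name
    LC_ALL=C sort |             # group identical characters together
    uniq -c |                   # emits "count char"
    awk '{ print $2, $1 }'      # swap the columns to "char count"
}

first_letter_counts "${1:-/bin}"
```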

Upvotes: 1
