Reputation: 1342

Count number of lines under each header in a text file using bash shell script

I can do this easily in Python or some other high-level language. What I am interested in is doing this with Bash.

Here is the file format:

head-xyz
item1
item2
item3
head-abc
item8
item5
item6
item9

What I would like to do is print the following output:

head-xyz: 3
head-abc: 4

Headers follow a specific pattern, as in the example above, and so do items. I am only interested in the count of items under each header.

Upvotes: 1

Answers (3)

hek2mgl

Reputation: 158160

You can use awk:

awk '/^head-/{h=$0}{c[h]++}END{for(i in c)print i": "(c[i]-1)}' input.file

Breakdown:

/^head-/{h=$0}

For every line that starts with head-, set the variable h to that line, recording the current header.

{c[h]++}

For every line in the file, increment c[h]. The array c maps each header string to a line count.

END{for(i in c)print i": "(c[i]-1)}

At the end, loop through the keys in array c and print each key (header) followed by its value (count). Subtract one so the header line itself is not counted. Note that for (i in c) visits the keys in an unspecified order, so the headers may not print in file order.
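If the headers should come out in the order they appear in the file, the same idea can be extended with a second array that remembers the order. The following is only a sketch: the function name count_items_awk is made up for illustration, and it assumes headers start with head- and that blank lines should not be counted.

```shell
# Hypothetical helper: count_items_awk FILE
# Prints "header: count" for each header, in file order.
count_items_awk() {
    awk '
        /^head-/ {
            h = $0
            # Remember each header the first time it is seen.
            if (!(h in c)) { order[++n] = h; c[h] = 0 }
            next
        }
        # Count non-blank lines that follow a header.
        h != "" && NF { c[h]++ }
        END {
            for (i = 1; i <= n; i++) print order[i] ": " c[order[i]]
        }
    ' "$1"
}
```

Because the header line is skipped with next, there is no need to subtract one at the end.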

Upvotes: 5

Jonathan Ross

Reputation: 550

If you don't consider sed a high-level language, here's another approach. It assumes each header block is stored in its own file whose name starts with head-:

for file in head-*; do
    printf '%s: ' "$file"
    sed -n '/^head-/,${
        /^head-/d
        /^item[0-9]/!q
        p
    }
    ' < "$file" | wc -l
done

In English, the sed script does the following:

- Don't print by default
- Within lines matching /^head-/ to end of file:
  - Delete the "head line"
  - After that, quit if you find a non-item line
  - Otherwise, print the line

wc -l then counts the printed lines.
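For the single-file layout shown in the question, the same sed range can be run once per header. This is a rough sketch rather than a polished solution: count_with_sed is a made-up name, and it assumes header lines are unique and contain no regex metacharacters or slashes, since each header is pasted into the sed address.

```shell
# Hypothetical helper: count_with_sed FILE
# Runs one sed pass per header line found in FILE.
count_with_sed() {
    grep '^head-' "$1" | while IFS= read -r header; do
        printf '%s: ' "$header"
        # From this header to end of file: drop the header line,
        # quit at the first non-item line, print the items in between.
        sed -n \
            -e "/^${header}\$/,\$ {" \
            -e "/^${header}\$/d" \
            -e '/^item[0-9]/!q' \
            -e 'p' \
            -e '}' "$1" | wc -l | tr -d ' '
    done
}
```

The tr -d ' ' strips the padding that some wc implementations put in front of the number. The file is read once per header, so this is slower than a single-pass awk or Bash loop on large inputs.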

Upvotes: 0

Michal Gasek

Reputation: 6423

Note: requires Bash 4 or later (uses associative arrays)

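#!/usr/bin/env bash

FILENAME="$1"
declare -A CNT
HEADLINE=""

# IFS= keeps leading/trailing whitespace; the || clause handles a last line without a newline.
while IFS= read -r LINE || [[ -n $LINE ]]
do
    # A header line starts a new group.
    if [[ $LINE =~ ^head ]]; then HEADLINE="$LINE"; fi
    # Skip lines before the first header: an empty string is not a valid subscript.
    [[ -z $HEADLINE ]] && continue
    if [[ -n ${CNT[$HEADLINE]+_} ]]
    then
        CNT[$HEADLINE]=$(( ${CNT[$HEADLINE]} + 1 ))
    else
        # First time this header is seen: start at 0 so the header line is not counted.
        CNT[$HEADLINE]=0
    fi
done < "$FILENAME"

for i in "${!CNT[@]}"; do echo "$i: ${CNT[$i]}"; done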

Output:

$ bash countitems.sh input
head-abc: 4
head-xyz: 3

Does this answer your question, @powerrox?
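Where Bash 4 is not available, a sketch along the following lines avoids associative arrays by printing each count as soon as the next header is reached. The function name count_items_ordered is hypothetical, and the code assumes headers start with head-, blank lines are ignored, and every other line is an item.

```shell
# Hypothetical helper: count_items_ordered FILE
# Uses plain variables (no arrays), so it does not depend on Bash 4.
count_items_ordered() {
    header=""
    count=0
    while IFS= read -r line || [ -n "$line" ]; do
        case $line in
            head-*)
                # A new header closes the previous group, if there was one.
                [ -n "$header" ] && printf '%s: %d\n' "$header" "$count"
                header=$line
                count=0
                ;;
            "")
                # Ignore blank lines.
                ;;
            *)
                # Count the line only if a header has already been seen.
                [ -n "$header" ] && count=$((count + 1))
                ;;
        esac
    done < "$1"
    # Flush the last group.
    if [ -n "$header" ]; then printf '%s: %d\n' "$header" "$count"; fi
}
```

Unlike the associative-array version, this prints the headers in file order, but a header that appears twice is reported twice rather than merged.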

Upvotes: 3
