Reputation: 536
I have a uniq -c command that outputs about 7-10 lines, each with the count of how many times a unique line pattern was repeated. I want to store the output of uniq -c file.txt in a bash array. Right now all I can do is store the output in a variable and print it. However, bash currently treats the entire output as one big string.
How does bash recognize delimiters? How do you store UNIX shell command output as Bash arrays?
Here is my current code:
proVar=`awk '{printf ("%s\t\n"), $1}' file.txt | grep -P 'pattern' | uniq -c`
echo $proVar
And current output I get:
587 chr1 578 chr2 359 chr3 412 chr4 495 chr5 362 chr6 287 chr7 408 chr8 285 chr9 287 chr10 305 chr11 446 chr12 247 chr13 307 chr14 308 chr15 365 chr16 342 chr17 245 chr18 252 chr19 210 chr20 193 chr21 173 chr22 145 chrX 58 chrY
Here is what I want:
proVar[1] = 2051
proVar[2] = 1243
proVar[3] = 1068
...
proVar[22] = 814
proVar[X] = 72
proVar[Y] = 13
In the long run, I'm hoping to make a bar plot based on the counts for each index, where every 50 counts equals one "=" sign. It will hopefully look like this:
chr1 ===========
chr2 ===========
chr3 =======
chr4 =========
...
chrX ==
chrY =
Any help, guys?
Upvotes: 2
Views: 990
Reputation: 437743
To build the associative array, try this:
declare -A proVar
while read -r val key; do
proVar[${key#chr}]=$val
done < <(awk '{printf ("%s\t\n"), $1}' file.txt | grep -P 'pattern' | uniq -c)
Note: This assumes that your command's output is composed of multiple lines, each containing one key-value pair; the single-line output shown in your question comes from passing $proVar to echo without double quotes.
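The quoting effect mentioned in the note can be seen with a minimal sketch. The two-line string below is hypothetical sample data standing in for the real uniq -c output:

```shell
#!/usr/bin/env bash
# Hypothetical two-line stand-in for the uniq -c output.
proVar=$(printf '%s\n' '587 chr1' '578 chr2')

# Unquoted: the shell word-splits the value, and echo joins the
# resulting words with single spaces, so the line breaks disappear.
echo $proVar

# Quoted: the value is passed as one argument, line breaks intact.
echo "$proVar"
```

So the variable itself still holds one entry per line; only the unquoted echo makes it look like a single line.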
The while loop reads each output line from a process substitution (<(...)).
The value for each associative array entry is each input line's first whitespace-separated token (the count), whereas the key is formed by stripping the prefix chr from the rest of the line (after the separating space).
To then create the bar plot, use:
while IFS= read -r key; do
echo "chr${key} $(printf '=%.s' $(seq $(( ${proVar[$key]} / 50 ))))"
done < <(printf '%s\n' "${!proVar[@]}" | sort -n)
Note: Using sort -n to sort the keys will put non-numeric keys such as X and Y before numeric ones in the output.
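If you would rather have the numeric keys first and X and Y last, one possible approach is to split the keys into two groups and sort each separately. The following is only a sketch under assumptions: the sample counts are made up, the helper names bar, sorted_keys and plot are hypothetical, and bash 4 or later is needed for associative arrays. It also uses a helper that prints nothing for a count below 50, because printf '=%.s' with no arguments still prints one =.

```shell
#!/usr/bin/env bash
# Hypothetical sample counts standing in for the real pipeline output.
declare -A proVar=([1]=587 [2]=578 [10]=287 [X]=145 [Y]=58)

# Print n '=' characters; prints nothing when n is 0.
bar() {
  local s
  printf -v s '%*s' "$1" ''   # build a string of n spaces
  printf '%s' "${s// /=}"     # replace every space with '='
}

# Emit the keys: numeric ones first (sorted numerically), then the rest.
sorted_keys() {
  local key num=() other=()
  for key in "${!proVar[@]}"; do
    if [[ $key =~ ^[0-9]+$ ]]; then num+=("$key"); else other+=("$key"); fi
  done
  if (( ${#num[@]} )); then printf '%s\n' "${num[@]}" | sort -n; fi
  if (( ${#other[@]} )); then printf '%s\n' "${other[@]}" | sort; fi
}

# Print one bar per key, one '=' per 50 counts (integer division).
plot() {
  local key
  while IFS= read -r key; do
    printf 'chr%s %s\n' "$key" "$(bar $(( ${proVar[$key]} / 50 )))"
  done < <(sorted_keys)
}

plot
```

With the sample data, the rows come out in the order chr1, chr2, chr10, chrX, chrY.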
$(( ${proVar[$key]} / 50 )) calculates the number of = characters to display, using integer division in an arithmetic expansion.
$(seq ...) simply creates as many tokens (arguments) as = characters should be displayed (the tokens created are numbers, but their content doesn't matter).
printf '=%.s' ... is a trick that effectively prints as many = characters as there are arguments following the format string.
printf '%s\n' "${!proVar[@]}" | sort -n sorts the keys of the associative array numerically, and its output is fed via a process substitution to the while loop, which therefore iterates over the keys in sorted order.
Upvotes: 3
Reputation: 780984
You can create an array in an assignment using parentheses:
proVar=(`awk '{printf ("%s\t\n"), $1}' file.txt | grep -P 'pattern' | uniq -c`)
There's no built-in way to create an associative array directly from input. For that you'll need an additional loop.
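As a rough sketch of that additional loop: the mapfile builtin (bash 4 or later) can first read the output into an indexed array with one element per line, and a loop can then split each line into count and name. The function sample_counts is a hypothetical stand-in for the awk | grep | uniq -c pipeline, and the array names lines and counts are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for: awk ... file.txt | grep -P 'pattern' | uniq -c
sample_counts() {
  printf '%7d %s\n' 587 chr1 578 chr2 145 chrX
}

# Indexed array: one element per output line (-t strips the newline).
mapfile -t lines < <(sample_counts)

# Associative array: the extra loop splits each line into count and name.
declare -A counts
for line in "${lines[@]}"; do
  read -r val key <<< "$line"   # leading whitespace from uniq -c is dropped
  counts[${key#chr}]=$val       # key without the chr prefix
done

echo "${#lines[@]} lines; chrX -> ${counts[X]}"
```

Unlike the plain parenthesized assignment, which splits on every run of whitespace, mapfile keeps each count together with its name in a single element.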
Upvotes: 0