Reputation: 85252
What is the equivalent of Python dictionaries in Bash (it should work across OS X and Linux)?
Upvotes: 863
Views: 776809
Reputation: 3447
and dictionaries seems non-universal. I.e. they cannot be used everywhere.
My dictionary is a function (dict1), which treats its first parameter as the key.
function dict1() {
    local key="$1"
    case $key in
        "key1") echo "value1";;
        "key2") echo "value2";;
        *) echo "";;
    esac
}
function foo() {
    local callback_function="$1"
    # access `key1` from the "dictionary", aka function
    echo "$("$callback_function" "key1")"
    # access `key2` from the "dictionary", aka function
    echo "$("$callback_function" "key2")"
}
# pass the "dictionary" to a function `foo`
foo dict1
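One thing the function-as-dictionary trick cannot do by itself is tell a missing key apart from an empty value. A minimal sketch of one way around that, assuming the function reports unknown keys through its exit status (dict_colors and lookup_or_default are made-up names, not part of the answer above):

```shell
#!/usr/bin/env bash
# Hypothetical "dictionary" function; names and values are for illustration only.
dict_colors() {
    case "$1" in
        red)   echo "#ff0000" ;;
        green) echo "#00ff00" ;;
        *)     return 1 ;;   # unknown key: signal it through the exit status
    esac
}

# Look up a key and fall back to a default when the key is missing.
lookup_or_default() {
    local dict="$1" key="$2" default="$3" value
    if value=$("$dict" "$key"); then
        echo "$value"
    else
        echo "$default"
    fi
}

lookup_or_default dict_colors red  "n/a"
lookup_or_default dict_colors blue "n/a"
```

The first call prints the stored value, the second prints the default. This still works in plain Bash 3, since it only uses functions and case.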
Upvotes: 0
Reputation: 6855
Just use the file system: the file system is a tree structure that can be used as a hash map. Your hash table will be a temporary directory, your keys will be filenames, and your values will be file contents. The advantage is that it can handle huge hashmaps, and doesn't require a specific shell.
hashtable=$(mktemp -d)
echo "$value" > "$hashtable/$key"
value=$(< "$hashtable/$key")
Of course, it's slow, but not that slow. I tested it on my machine, with an SSD and btrfs, and it does around 3000 element reads/writes per second.
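If you go this route, it may help to wrap the three operations in small functions and clean up the directory on exit. A rough sketch, assuming keys are valid file names (no slashes, not empty); fs_put, fs_get and fs_has are names I made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper names (fs_put, fs_get, fs_has); keys must be valid file names.
fs_map=$(mktemp -d)
trap 'rm -rf "$fs_map"' EXIT        # remove the "hash table" when the script ends

fs_put() { printf '%s' "$2" > "$fs_map/$1"; }   # store value $2 under key $1
fs_get() { cat "$fs_map/$1"; }                  # print the value for key $1
fs_has() { [ -e "$fs_map/$1" ]; }               # succeed if key $1 exists

fs_put France Paris
fs_put Spain Madrid

if fs_has France; then
    echo "France -> $(fs_get France)"
fi
ls "$fs_map"                         # listing the directory lists the keys
```

Note that command substitution strips trailing newlines, so values that end in newlines will not round-trip exactly.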
Upvotes: 42
Reputation: 56855
Here's a rather contrived but hopefully instructive hash/map/dictionary/associative array example. Let's say I have an array of strings and I'd like to create a mapping from each word to a count of times it appears in the array.
Sure, there are ways to do this with piped commands, but the point is to illustrate the core map operations: checking existence of a key with -v, adding key-value mappings, retrieving the value for a key, updating the existing value for a key and looping over the whole map to print the key-value pairs.
set -o pipefail
bash --version | head -1
words=(foo foo bar bar foo baz baz foo bar)
declare -A counter=() # create the map
for word in "${words[@]}"; do
    # if the key doesn't yet exist in the map, add it
    if [[ ! -v counter[$word] ]]; then
        counter[$word]=0
    fi
    # look up the value of a key, add one, and store back in the map
    counter[$word]=$((${counter[$word]} + 1))
done
# iterate the map
for key in "${!counter[@]}"; do
    echo "$key ${counter[$key]}"
done
GNU bash, version 5.1.16(1)-release (x86_64-pc-linux-gnu)
foo 4
bar 3
baz 2
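A few related operations that the example does not cover: testing for a key on Bash versions older than 4.3 (where -v does not accept array elements), reading with a default, deleting a key, and printing in a stable order (iteration order of an associative array is not guaranteed). A small sketch, starting from the counts printed above:

```shell
#!/usr/bin/env bash
declare -A counter=( [foo]=4 [bar]=3 [baz]=2 )

# Existence test without -v: the expansion yields "set" only if the key exists.
if [[ -n "${counter[qux]+set}" ]]; then
    echo "qux is present"
else
    echo "qux is absent"
fi

# Read with a default instead of testing first.
echo "qux count: ${counter[qux]:-0}"

# Delete a key; the quotes keep the shell from globbing the brackets.
unset 'counter[bar]'
echo "keys left: ${#counter[@]}"

# Print in a stable order, highest count first.
for key in "${!counter[@]}"; do
    printf '%s %s\n' "$key" "${counter[$key]}"
done | sort -k2,2nr
```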
Upvotes: 2
Reputation: 643
You can further modify the hput()/hget() interface so that you have named hashes as follows:
hput() {
    eval "$1""$2"='$3'
}
hget() {
    eval echo '${'"$1$2"'#hash}'
}
and then
hput capitals France Paris
hput capitals Netherlands Amsterdam
hput capitals Spain Madrid
echo `hget capitals France` and `hget capitals Netherlands` and `hget capitals Spain`
This lets you define other maps that don't conflict (e.g., 'rcapitals' which does country lookup by capital city). But, either way, I think you'll find that this is all pretty terrible, performance-wise.
EDIT: Modified version of above that supports keys with non-alphanumeric characters
hashKey() {
    # replace non-alphanumeric characters with underscore to make keys valid BASH identifiers
    echo "$1_$2" | sed -E "s/[^a-zA-Z0-9]+/_/g" | sed -E "s/^[^a-zA-Z0-9]+|[^a-zA-Z0-9]+\$//g"
}
hashPut() {
    local KEY=$(hashKey "$1" "$2")
    # the single quotes leave "$3" for eval to expand, so the value is not re-parsed as code
    eval "$KEY"'="$3"'
}
hashGet() {
    local KEY=$(hashKey "$1" "$2")
    echo "${!KEY}"
}
If you really want fast hash lookup, there's a terrible, terrible hack that actually works really well. It is this: write your key/values out to a temporary file, one per line, then use 'grep "^$key"' to get them out, using pipes with cut or awk or sed or whatever to retrieve the values.
Like I said, it sounds terrible, and it sounds like it ought to be slow and do all sorts of unnecessary IO, but in practice it is very fast (disk cache is awesome, ain't it?), even for very large hash tables. You have to enforce key uniqueness yourself, etc. Even if you only have a few hundred entries, the output file/grep combo is going to be quite a bit faster - in my experience several times faster. It also eats less memory.
Here's one way to do it:
hinit() {
    rm -f "/tmp/hashmap.$1"
}
hput() {
    echo "$2 $3" >> "/tmp/hashmap.$1"
}
hget() {
    grep "^$2 " "/tmp/hashmap.$1" | awk '{ print $2 };'
}
hinit capitals
hput capitals France Paris
hput capitals Netherlands Amsterdam
hput capitals Spain Madrid
echo `hget capitals France` and `hget capitals Netherlands` and `hget capitals Spain`
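Since the file approach leaves key uniqueness up to you, here is one possible way to enforce it: rewrite the file without the old entry before appending the new one. This is only a sketch under the same assumption as the code above (keys and values contain no whitespace); hmap_file, hput_unique and hget_exact are hypothetical names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; keys and values must not contain whitespace,
# the same limitation the awk-based hget has.
hmap_file() {
    printf '%s/hashmap.%s' "${TMPDIR:-/tmp}" "$1"
}

hput_unique() {
    local file tmp
    file=$(hmap_file "$1")
    tmp="$file.tmp.$$"
    # copy every line whose key differs, then append the new pair
    if [ -f "$file" ]; then
        awk -v k="$2" '($1 "") != k' "$file" > "$tmp"
    else
        : > "$tmp"
    fi
    printf '%s %s\n' "$2" "$3" >> "$tmp"
    mv "$tmp" "$file"
}

hget_exact() {
    # compare the first field as a string instead of using the key as a regex
    awk -v k="$2" '($1 "") == k { print $2; exit }' "$(hmap_file "$1")"
}

hput_unique capitals_demo France Lyon
hput_unique capitals_demo France Paris   # replaces the earlier entry
hget_exact capitals_demo France
```

The trade-off is that every write now rewrites the whole file, so this favours read-heavy use. Comparing the key as a string in awk also avoids surprises when a key contains regex characters such as a dot.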
Upvotes: 32
Reputation: 125376
Bash 4 natively supports this feature. Make sure your script's hashbang is #!/usr/bin/env bash or #!/bin/bash so you don't end up using sh. Make sure you're either executing your script directly, or execute script with bash script. (Not actually executing a Bash script with Bash does happen, and will be really confusing!)
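If you want the script itself to complain when it is run by the wrong shell or an old Bash, a guard along these lines at the top is one option. It is a sketch, not something from the Bash documentation; have_bash4 is a made-up name. It relies on BASH_VERSION, which Bash sets (even when invoked as sh) and which other shells such as dash leave unset.

```shell
#!/usr/bin/env bash
# Returns success only when running under bash 4 or newer.
# BASH_VERSION is unset in shells that are not bash.
have_bash4() {
    case "${BASH_VERSION:-}" in
        ''|[1-3].*) return 1 ;;   # not bash, or bash 1.x to 3.x
        *)          return 0 ;;
    esac
}

if ! have_bash4; then
    echo "this script needs bash 4 or newer" >&2
    exit 1
fi

echo "running bash $BASH_VERSION"
```

The check uses only POSIX syntax on purpose, so that it fails with a readable message instead of a syntax error when some other shell runs it.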
You declare an associative array by doing:
declare -A animals
You can fill it up with elements using the normal array assignment operator. For example, if you want to have a map of animal[sound(key)] = animal(value)
animals=( ["moo"]="cow" ["woof"]="dog")
Or declare and instantiate in one line:
declare -A animals=( ["moo"]="cow" ["woof"]="dog")
Then use them just like normal arrays. Use
animals['key']='value' to set value
"${animals[@]}" to expand the values
"${!animals[@]}" (notice the !) to expand the keys
Don't forget to quote them:
echo "${animals[moo]}"
for sound in "${!animals[@]}"; do echo "$sound - ${animals[$sound]}"; done
Before bash 4, you don't have associative arrays. Do not use eval to emulate them. Avoid eval like the plague, because it is the plague of shell scripting. The most important reason is that eval treats your data as executable code (there are many other reasons too).
First and foremost: Consider upgrading to bash 4. This will make the whole process much easier for you.
If there's a reason you can't upgrade, declare is a far safer option. It does not evaluate data as bash code like eval does, and as such does not allow arbitrary code injection quite so easily.
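To make the difference concrete, here is a small demonstration with a harmless stand-in for hostile input (the variable names are made up for this example):

```shell
#!/usr/bin/env bash
# A harmless stand-in for hostile input: a literal command substitution.
untrusted='$(echo INJECTED)'

eval "with_eval=$untrusted"        # eval re-parses the string and runs the command
declare "with_declare=$untrusted"  # declare stores the string as-is

echo "eval:    $with_eval"
echo "declare: $with_declare"
```

With eval the command inside the data actually runs and its output ends up in the variable; with declare the variable simply holds the literal text. Replace the echo with anything destructive and you can see why eval on untrusted data is a problem.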
Let's prepare the answer by introducing the concepts:
First, indirection.
$ animals_moo=cow; sound=moo; i="animals_$sound"; echo "${!i}"
Secondly, declare
$ sound=moo; animal=cow; declare "animals_$sound=$animal"; echo "$animals_moo"
Bring them together:
# Set a value:
declare "array_$index=$value"
# Get a value:
arrayGet() {
    local array=$1 index=$2
    local i="${array}_$index"
    printf '%s' "${!i}"
}
Let's use it:
$ sound=moo
$ animal=cow
$ declare "animals_$sound=$animal"
$ arrayGet animals "$sound"
Note: declare cannot be put in a function. Any use of declare inside a bash function turns the variable it creates local to the scope of that function, meaning we can't access or modify global arrays with it. (In bash 4 you can use declare -g to declare global variables - but in bash 4, you can use associative arrays in the first place, avoiding this workaround.)
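If you do need a setter function on an old Bash, one possible workaround is printf -v, which assigns to a variable by name without making it local. As far as I know it is available from bash 3.1 on, so treat that version as an assumption and test on your target system. arraySet is a hypothetical companion to the arrayGet shown above:

```shell
#!/usr/bin/env bash
# Hypothetical helper names; keys must consist of valid identifier characters.
arraySet() {
    local array=$1 index=$2 value=$3
    # printf -v assigns to the named variable without declaring it local
    printf -v "${array}_${index}" '%s' "$value"
}

arrayGet() {
    local array=$1 index=$2
    local i="${array}_$index"
    printf '%s' "${!i}"
}

arraySet animals moo cow
arraySet animals woof dog
arrayGet animals moo; echo
```

Like declare, printf -v does not evaluate the value as code, but the key still becomes part of a variable name, so it should be validated if it comes from outside.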
Upgrade to bash 4 and use declare -A for associative arrays.
Use the declare option if you can't upgrade.
Consider using awk instead and avoid the issue altogether.
Upvotes: 1438
Reputation: 16108
A coworker just mentioned this thread. I've independently implemented hash tables within bash, and it's not dependent on version 4. From a blog post of mine in March 2010 (before some of the answers here...) entitled Hash tables in bash:
I previously used cksum to hash but have since translated Java's string hashCode to native bash/zsh.
# Here's the hashing function
ht() {
    local h=0 i
    for (( i=0; i < ${#1}; i++ )); do
        let "h=( (h<<5) - h ) + $(printf %d \'${1:$i:1})"
        let "h |= h"
    done
    # print via %s so the hash is never interpreted as a format string or option
    printf '%s' "$h"
}
# Example:
myhash[`ht foo bar`]="a value"
myhash[`ht baz baf`]="b value"
echo ${myhash[`ht baz baf`]} # "b value"
echo ${myhash[@]} # "a value b value" though perhaps reversed
echo ${#myhash[@]} # "2" - there are two values (note, zsh doesn't count right)
It's not bidirectional, and the built-in way is a lot better, but neither should really be used anyway. Bash is for quick one-offs, and such things should quite rarely involve complexity that might require hashes, except perhaps in your ~/.bashrc and friends.
Upvotes: 5
Reputation: 963
Consider a solution using the bash builtin read, as illustrated in the following snippet from a ufw firewall script. This approach has the advantage of allowing as many delimited fields per record (not just 2) as desired. We have used the | delimiter because port range specifiers may require a colon, e.g. 6001:6010.
#!/usr/bin/env bash
readonly connections=(
function set_connections(){
    local range proto port
    for fields in "${connections[@]}"
    do
        IFS=$'|' read -r range proto port <<< "$fields"
        ufw allow from "$range" proto "$proto" to any port "$port"
    done
}
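To try the splitting logic without touching a firewall, here is a self-contained sketch that only prints the fields. The records are invented for illustration and are not the entries from the original script:

```shell
#!/usr/bin/env bash
# Hypothetical records in the form 'range|proto|port'; the values are made up.
records=(
    '10.0.0.0/24|tcp|22'
    '192.168.1.0/24|udp|6001:6010'
)

for record in "${records[@]}"; do
    # IFS is changed only for this one read, so the rest of the script is unaffected
    IFS='|' read -r range proto port <<< "$record"
    printf 'range=%s proto=%s port=%s\n' "$range" "$proto" "$port"
done
```

Because the delimiter is |, the colon in the port range survives as part of the third field.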
Upvotes: 25
Reputation: 4293
I also used the Bash 4 way, but I found an annoying bug.
I needed to update the associative array content dynamically, so I did it this way:
for instanceId in $instanceList
do
    aws cloudwatch describe-alarms --output json --alarm-name-prefix $instanceId| jq '.["MetricAlarms"][].StateValue'| xargs | grep -E 'ALARM|INSUFFICIENT_DATA'
    [ $? -eq 0 ] && statusCheck+=([$instanceId]="checkKO") || statusCheck+=([$instanceId]="allCheckOk")
done
I found that with bash 4.3.11, appending to an existing key in the dict appended to the value if it was already present. So, for example, after some repetitions the content of the value was "checkKOcheckKOallCheckOK", and this was not good.
There is no problem with bash 4.3.39, where appending to an existing key replaces the actual value if already present.
I solved this just by cleaning/declaring the statusCheck associative array before the loop:
unset statusCheck; declare -A statusCheck
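Another option that sidesteps the version difference is to avoid the += compound form and assign the element directly, since a plain element assignment always replaces the previous value. A minimal sketch with a made-up instance id and results standing in for the aws/jq pipeline:

```shell
#!/usr/bin/env bash
declare -A statusCheck=()

# Hypothetical instance id and results, standing in for the aws/jq pipeline.
instanceId="i-0abc123"
for result in checkKO checkKO allCheckOk; do
    statusCheck[$instanceId]=$result   # plain assignment replaces any previous value
done

echo "${statusCheck[$instanceId]}"
```

After the loop the element holds only the last result, regardless of how many times the key was written.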
Upvotes: 0
Reputation: 2072
This is what I was looking for here:
declare -A hashmap
echo "${hashmap["key"]}"
for key in ${!hashmap[@]}; do echo $key; done
for value in ${hashmap[@]}; do echo $value; done
echo hashmap has ${#hashmap[@]} elements
This did not work for me with bash 4.1.5:
animals=( ["moo"]="cow" )
Upvotes: 141
Reputation: 580
I agree with @lhunath and others that associative arrays are the way to go with Bash 4. If you are stuck on Bash 3 (OS X, old distros that you cannot update), you can also use expr, which should be everywhere, together with a string and regular expressions. I like it especially when the dictionary is not too big.
Write your map as a string (note the separator ',' also at beginning and end)
animals=",moo:cow,woof:dog,"
Use a regex to extract the values
get_animal() {
    echo "$(expr "$animals" : ".*,$1:\([^,]*\),.*")"
}
Split the string to list the items
get_animal_items() {
    arr=$(echo "${animals:1:${#animals}-2}" | tr "," "\n")
    for i in $arr
    do
        value="${i##*:}"
        key="${i%%:*}"
        echo "${value} likes to $key"
    done
}
Now you can use it:
$ animal=$(get_animal "moo")
$ get_animal_items
cow likes to moo
dog likes to woof
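The same string layout can also be queried with parameter expansion alone, which avoids forking expr for every lookup. This is a variation on the idea, not part of the answer above; get_animal_pe is a made-up name and the map string follows the format described (a comma at both ends, key:value pairs in between).

```shell
#!/usr/bin/env bash
animals=",moo:cow,woof:dog,"

# Look up a key using only parameter expansion (no external commands).
get_animal_pe() {
    local rest
    case "$animals" in
        *",$1:"*)
            rest=${animals#*",$1:"}      # drop everything up to and including ",key:"
            printf '%s\n' "${rest%%,*}"  # keep the text before the next comma
            ;;
        *)
            return 1                     # key not present
            ;;
    esac
}

get_animal_pe moo
get_animal_pe woof
```

A missing key gives a non-zero exit status instead of an empty string, which makes it easier to tell the two cases apart.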
Upvotes: 8
Reputation: 1579
Bash 3 solution:
In reading some of the answers I put together a quick little function I would like to contribute back that might help others.
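# Define a hash like this
# Function to get value by key
getHashKey() {
    declare -a hash=("${!1}")
    local key
    local lookup=$2
    for key in "${hash[@]}" ; do
        # assumes each element is a "key:value" string, split on the first ':'
        KEY=${key%%:*}
        VALUE=${key#*:}
        if [[ $KEY == $lookup ]]
        then
            echo $VALUE
        fi
    done
}
# Function to get a list of all keys
getHashKeys() {
    declare -a hash=("${!1}")
    local KEY
    local VALUE
    local key
    local lookup=$2
    for key in "${hash[@]}" ; do
        KEY=${key%%:*}
        VALUE=${key#*:}
        keys+="${KEY} "
    done
    echo $keys
}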
# Here we want to get the value of 'lastName'
echo $(getHashKey MYHASH[@] "lastName")
# Here we want to get all keys
echo $(getHashKeys MYHASH[@])
Upvotes: 2
Reputation: 4097
There's parameter substitution, though it may be as un-PC as indirection.
# Array pretending to be a Pythonic dictionary
ARRAY=( "cow:moo"
"bash:rock" )
for animal in "${ARRAY[@]}" ; do
    KEY="${animal%%:*}"
    VALUE="${animal##*:}"
    printf "%s likes to %s.\n" "$KEY" "$VALUE"
done
printf "%s is an extinct animal which likes to %s\n" "${ARRAY[1]%%:*}" "${ARRAY[1]##*:}"
The BASH 4 way is better of course, but if you need a hack ...only a hack will do. You could search the array/hash with similar techniques.
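As for searching, a lookup over such an array might look like the sketch below. The function name lookup is hypothetical, and the array only contains the two entries visible above.

```shell
#!/usr/bin/env bash
ARRAY=( "cow:moo"
        "bash:rock" )

# Print the value stored for a key; fail if the key is not found.
lookup() {
    local wanted=$1 entry
    for entry in "${ARRAY[@]}"; do
        if [ "${entry%%:*}" = "$wanted" ]; then
            printf '%s\n' "${entry#*:}"   # everything after the first ':'
            return 0
        fi
    done
    return 1
}

lookup cow
lookup bash
```

This is a linear scan, so it is fine for a handful of entries but not a replacement for a real hash on large data. It works in Bash 3, since it only needs indexed arrays.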
Upvotes: 167
Reputation: 2447
I really liked Al P's answer but wanted uniqueness enforced cheaply so I took it one step further - use a directory. There are some obvious limitations (directory file limits, invalid file names) but it should work for most cases.
hinit() {
    rm -rf /tmp/hashmap.$1
    mkdir -p /tmp/hashmap.$1
}
hput() {
    printf '%s' "$3" > /tmp/hashmap.$1/$2
}
hget() {
    cat /tmp/hashmap.$1/$2
}
hkeys() {
    ls -1 /tmp/hashmap.$1
}
hdestroy() {
    rm -rf /tmp/hashmap.$1
}
hinit ids
for (( i = 0; i < 10000; i++ )); do
    hput ids "key$i" "value$i"
done
for (( i = 0; i < 10000; i++ )); do
    printf '%s\n' $(hget ids "key$i") > /dev/null
done
hdestroy ids
It also performs a tad bit better in my tests.
$ time bash
real 0m46.500s
user 0m16.767s
sys 0m51.473s
$ time bash
real 0m35.875s
user 0m8.002s
sys 0m24.666s
Just thought I'd pitch in. Cheers!
Edit: Adding hdestroy()
Upvotes: 6
Reputation: 31
Two things: you can use memory instead of /tmp on any 2.6 kernel by using /dev/shm (Red Hat); other distros may vary. Also, hget can be reimplemented using read as follows:
function hget {
    while read key idx
    do
        if [ "$key" = "$2" ]
        then
            echo $idx
            return
        fi
    done < /dev/shm/hashmap.$1
}
In addition, assuming that all keys are unique, the return short-circuits the read loop and avoids reading through all entries. If your implementation can have duplicate keys, simply leave out the return. This saves the expense of forking both grep and awk. Using /dev/shm for both implementations yielded the following using time hget on a 3-entry hash, searching for the last entry:
hget() {
    grep "^$2 " /dev/shm/hashmap.$1 | awk '{ print $2 };'
}
$ time echo $(hget FD oracle)
real 0m0.011s
user 0m0.002s
sys 0m0.013s
$ time echo $(hget FD oracle)
real 0m0.004s
user 0m0.000s
sys 0m0.004s
On multiple invocations I never saw less than a 50% improvement.
Since both versions use /dev/shm, this can all be attributed to fork overhead.
Upvotes: 3
Reputation: 77059
Prior to bash 4 there is no good way to use associative arrays in bash. Your best bet is to use an interpreted language that actually has support for such things, like awk. On the other hand, bash 4 does support them.
As for less good ways in bash 3, here is a reference that might help:
Upvotes: 2
Reputation: 146053
hput () {
    eval hash"$1"='$2'
}
hget () {
    eval echo '${hash'"$1"'#hash}'
}
hput France Paris
hput Netherlands Amsterdam
hput Spain Madrid
echo `hget France` and `hget Netherlands` and `hget Spain`
$ sh
Paris and Amsterdam and Madrid
Upvotes: 20