Leonard

Reputation: 13777

Are double square brackets [[ ]] preferable over single square brackets [ ] in Bash?

A coworker claimed recently in a code review that the [[ ]] construct is to be preferred over [ ] in constructs like

if [ "`id -nu`" = "$someuser" ] ; then
     echo "I love you madly, $someuser"
fi

He couldn't provide a rationale. Is there one?

Upvotes: 940

Views: 424914

Answers (11)

Mark Reed

Reputation: 95335

In a question tagged 'bash' that explicitly has "in Bash" in the title, I'm a little surprised by all of the replies saying you should avoid [[...]] because it only works in bash!

It's true that portability is the primary objection: if you want to write a shell script which works in Bourne-compatible shells even if they aren't bash (or ksh or zsh), you should avoid [[...]]. If you're in that situation and want to test your shell scripts in a more strictly POSIX shell, I recommend dash; though it is not strictly POSIX itself (it lacks the internationalization support required by the standard, and does support a few non-POSIX things like local variables), it's much closer than any of the bash/ksh/zsh trinity.

The other objection I see is at least applicable within the assumption of bash: that [[...]] has its own special rules which you have to learn, while [...] acts like just another command. That is again true (and Mr. Santilli brought the receipts showing all the differences), but it's rather subjective whether the differences are good or bad. I personally find it freeing that the double-bracket construct lets me use (...) for grouping, && and || for Boolean logic, < and > for comparison, and unquoted parameter expansions. It's like its own little closed-off world where expressions work more like they do in traditional, non-command-shell programming languages.
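
To make that concrete, here is a minimal sketch of grouping, Boolean operators, a lexical comparison, and unquoted expansions inside one [[ ]] expression. The variable names and values are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sample values, chosen only for illustration.
name="alpha beta"
ext="txt"

# Unquoted expansions, ( ) grouping, && / ||, and < all work inside [[ ]].
# Here < is a lexical (string) comparison, not a redirection.
if [[ ( $ext == txt || $ext == md ) && $name < "b" ]]; then
    result="matched"
else
    result="no match"
fi

echo "$result"
```

With single brackets the same test would need quoting around every expansion, escaped parentheses or separate [ ] commands, and an escaped \< operator.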

A point I haven't seen raised is that this behavior of [[...]] is entirely consistent with that of the arithmetic expansion construct $((...)), which is specified by POSIX, and also allows unquoted parentheses and Boolean and inequality operators (which here perform numeric instead of lexical comparisons). Essentially, any time you see the doubled bracket characters you get the same quote-shielding effect.

(Bash and its modern relatives also use ((...)) – without the leading $ – as either a C-style for loop header or an environment for performing arithmetic operations without substituting the final value; neither syntax is part of POSIX.)
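
A short sketch of the two arithmetic forms mentioned above, with made-up values: $((...)) substitutes the computed value, while ((...)) only sets an exit status.

```bash
#!/usr/bin/env bash
# Hypothetical sample values.
a=7
b=3

# $(( )) is specified by POSIX: parentheses and operators need no quoting.
sum=$(( (a + b) * 2 ))
is_bigger=$(( a > b && b > 0 ))   # comparisons yield 1 (true) or 0 (false)

# (( )) without the leading $ is the Bash/ksh/zsh extension: it substitutes
# nothing and succeeds when the expression is non-zero.
if (( a > b )); then
    verdict="a is larger"
else
    verdict="a is not larger"
fi

echo "$sum $is_bigger $verdict"
```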

So there are some good reasons to prefer [[...]]; there are also reasons to avoid it, which may or may not be applicable in your environment. As to your coworker, "our style guide says so" is a valid justification, as far as it goes, but I'd also seek out backstory from someone who understands why the style guide recommends what it does.

Upvotes: 23

Gabriel Staples

Reputation: 53085

A coworker claimed recently in a code review that the [[ ]] construct is to be preferred over [ ]
...
He couldn't provide a rationale. Is there one?

Yes, speed and efficiency.

I don't see the word "speed" mentioned anywhere here, but [[ ]] is Bash keyword syntax that the shell parses directly, whereas [ is the test command and is run like any other command, with ] simply being its last argument. In my tests below, [[ ]] is measurably faster than [ ].

Does [ spawn a new process?
Running which [ tells me it is located at /usr/bin/[, but which only searches PATH for external executables. Bash also provides [ as a built-in (type [ reports "[ is a shell builtin"), and the built-in takes precedence, so no new process is spawned in either case. The speed difference presumably comes from the ordinary command handling that [ goes through rather than from process creation.

A second reason to prefer [[ ]] is that I think it is more feature-rich: it supports a more "C-style" and modern syntax.

I have traditionally favored [ ] over [[ ]] because it's more portable, but recently I think I may switch to [[ ]] over [ ] because it's faster, and I write a lot of Bash.

Speed test results (lower is better)

Here are my results over 2 million iterations: [[ ]] is 1.42x faster than [ ]:

The code being tested was super simple:

if [ "$word1" = "$word2" ]; then
    echo "true"
fi

vs.

if [[ "$word1" == "$word2" ]]; then
    echo "true"
fi

where word1 and word2 were just the following constants, so echo "true" never ran:

word1="true"
word2="false"

If you want to run the test yourself, the code is below. I've embedded a rather sophisticated Python program as a heredoc string into the Bash script to do the plotting.

If you change the comparison string constants to these:

word1="truetruetruetruetruetruetruetruetruetruetruetruetruetruetruetruetru"
word2="truetruetruetruetruetruetruetruetruetruetruetruetruetruetruetruetrufalse"

...then you get these results, where the Bash built-in ([[ ]]) is only 1.22x faster than the test command ([ ]):

In the first case, the strings differ after only 1 char, so the comparison can end immediately. In the latter case, the strings differ after 67 chars, so the comparison takes much longer to identify that the strings differ. I suspect that the longer the strings are, the smaller the speed difference will be: most of the difference is initially the fixed per-call overhead of the [ command, and as the strings match for longer, that overhead matters less. That's my suspicion anyway.
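
For a quick check without the plotting machinery, a lighter-weight sketch using only Bash's time keyword might look like this. The iteration count is a made-up value, far smaller than the 2 million used above, so the numbers will be noisier.

```bash
#!/usr/bin/env bash
# Hypothetical iteration count; raise it for steadier numbers.
iterations=100000
word1="true"
word2="false"

# `time` is a Bash keyword, so it can time a whole loop; it reports on stderr.
time for ((i = 0; i < iterations; i++)); do
    [ "$word1" = "$word2" ]
done

# The right-hand side is quoted so it is compared literally, not as a pattern.
time for ((i = 0; i < iterations; i++)); do
    [[ $word1 == "$word2" ]]
done
```

Compare the two "real" figures; the absolute values depend on the machine and the Bash version.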

speed_tests__comparison_with_test_cmd_vs_double_square_bracket_bash_builtin.sh from my eRCaGuy_hello_world repo:

#!/usr/bin/env bash

# This file is part of eRCaGuy_hello_world: https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world

# ==============================================================================
# Python plotting program
# - is a Bash heredoc
# References:
# 1. My `plot_data()` function here:
#    https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world/blob/master/python/pandas_dataframe_iteration_vs_vectorization_vs_list_comprehension_speed_tests.py
# 1. See my answer here: https://stackoverflow.com/a/77270285/4561887
# ==============================================================================
python_plotting_program=$(cat <<'PROGRAM_END'

# 3rd-party imports
import matplotlib.pyplot as plt
import pandas as pd

# standard imports
import os
import sys

assert sys.argv[0] == "-c"
# print(f"sys.argv = {sys.argv}")  # debugging

# Get the command-line arguments
FULL_PATH_TO_SCRIPT = sys.argv[1]
NUM_ITERATIONS = int(sys.argv[2])
single_bracket_sec = float(sys.argv[3])
double_bracket_sec = float(sys.argv[4])

# Obtain paths to help save the plot later.
# See my answer: https://stackoverflow.com/a/74800814/4561887
SCRIPT_DIRECTORY = str(os.path.dirname(FULL_PATH_TO_SCRIPT))
FILENAME = str(os.path.basename(FULL_PATH_TO_SCRIPT))
FILENAME_NO_EXTENSION = os.path.splitext(FILENAME)[0]

# place into lists
labels = ['`[ ]` `test` func', '`[[ ]]` Bash built-in']
data = [single_bracket_sec, double_bracket_sec]

# place into a Pandas dataframe for easy manipulation and plotting
df = pd.DataFrame({'test_type': labels, 'time_sec': data})
df = df.sort_values(by="time_sec", axis='rows', ascending=False)
df = df.reset_index(drop=True)

# plot the data
fig = plt.figure()
# Plot from the sorted dataframe so each bar lines up with its text label below
plt.bar(df["test_type"], df["time_sec"])
plt.title(f"Speed Test: `[ ]` vs `[[ ]]` over {NUM_ITERATIONS:,} iterations")
plt.xlabel('Test Type', labelpad=8)  # use `labelpad` to lower the label
plt.ylabel('Time (sec)')

# Prepare to add text labels to each bar
df["text_x"] = df.index # use the indices as the x-positions
df["text_y"] = df["time_sec"] + 0.06*df["time_sec"].max()
df["time_multiplier"] = df["time_sec"] / df["time_sec"].min()
df["text_label"] = (df["time_sec"].map("{:.4f} sec\n".format) +
                    df["time_multiplier"].map("{:.2f}x".format))

# Use a list comprehension to actually call `plt.text()` to **automatically add
# a plot label** for each row in the dataframe
[
    plt.text(
        text_x,
        text_y,
        text_label,
        horizontalalignment='center',
        verticalalignment='center'
    ) for text_x, text_y, text_label
    in zip(
        df["text_x"],
        df["text_y"],
        df["text_label"]
    )
]

# add 10% to the top of the y-axis to leave space for labels
ymin, ymax = plt.ylim()
plt.ylim(ymin, ymax*1.1)

plt.savefig(f"{SCRIPT_DIRECTORY}/{FILENAME_NO_EXTENSION}.svg")
plt.savefig(f"{SCRIPT_DIRECTORY}/{FILENAME_NO_EXTENSION}.png")

plt.show()

PROGRAM_END
)

# ==============================================================================
# Bash speed test program
# ==============================================================================

# See my answer: https://stackoverflow.com/a/60157372/4561887
FULL_PATH_TO_SCRIPT="$(realpath "${BASH_SOURCE[-1]}")"

NUM_ITERATIONS="2000000" # 2 million
# NUM_ITERATIONS="1000000" # 1 million
# NUM_ITERATIONS="10000" # 10k

word1="true"
word2="false"

# Get an absolute timestamp in floating point seconds.
# From:
# https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world/blob/master/bash/timestamp_lib_WIP.sh
seconds_float() {
    time_sec="$(date +"%s.%N")"
    echo "$time_sec"
}

single_bracket() {
    for i in $(seq 1 "$NUM_ITERATIONS"); do
        if [ "$word1" = "$word2" ]; then
            echo "true"
        fi
    done
}

double_bracket() {
    for i in $(seq 1 "$NUM_ITERATIONS"); do
        if [[ "$word1" == "$word2" ]]; then
            echo "true"
        fi
    done
}

run_and_time_function() {
    # the 1st arg is the function to run
    func_to_time="$1"

    # NB: the "information" type prints will go to stderr so they don't
    # interfere with the actual timing results printed to stdout.

    echo -e "== $func_to_time time test start... ==" >&2  # to stderr
    time_start="$(seconds_float)"

    $func_to_time

    time_end="$(seconds_float)"
    elapsed_time="$(bc <<< "scale=20; $time_end - $time_start")"
    echo "== $func_to_time time test end. ==" >&2  # to stderr
    echo "$elapsed_time"  # to stdout
}

main() {
    echo "Running speed tests over $NUM_ITERATIONS iterations."

    single_bracket_time_sec="$(run_and_time_function "single_bracket")"
    double_bracket_time_sec="$(run_and_time_function "double_bracket")"

    echo "single_bracket_time_sec = $single_bracket_time_sec"
    echo "double_bracket_time_sec = $double_bracket_time_sec"

    # echo "Plotting the results in Python..."
    python3 -c "$python_plotting_program" \
        "$FULL_PATH_TO_SCRIPT" \
        "$NUM_ITERATIONS" \
        "$single_bracket_time_sec" \
        "$double_bracket_time_sec"
}

# Determine if the script is being sourced or executed (run).
# See:
# 1. "eRCaGuy_hello_world/bash/if__name__==__main___check_if_sourced_or_executed_best.sh"
# 1. My answer: https://stackoverflow.com/a/70662116/4561887
if [ "${BASH_SOURCE[0]}" = "$0" ]; then
    # This script is being run.
    __name__="__main__"
else
    # This script is being sourced.
    __name__="__source__"
fi

# Only run `main` if this script is being **run**, NOT sourced (imported).
# - See my answer: https://stackoverflow.com/a/70662116/4561887
if [ "$__name__" = "__main__" ]; then
    main "$@"
fi

Sample run and output:

eRCaGuy_hello_world$ bash/speed_tests__comparison_with_test_cmd_vs_double_square_bracket_bash_builtin.sh
Running speed tests over 2000000 iterations.
== single_bracket time test start... ==
== single_bracket time test end. ==
== double_bracket time test start... ==
== double_bracket time test end. ==
single_bracket_time_sec = 5.990248014
double_bracket_time_sec = 4.230342635

References

  1. My plot_data() function here, for how to make the bar plot with the sophisticated text above the bars: https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world/blob/master/python/pandas_dataframe_iteration_vs_vectorization_vs_list_comprehension_speed_tests.py
    1. My answer with this code: How to iterate over rows in a DataFrame in Pandas
  2. My answer: How do I get the path and name of the python file that is currently executing?
  3. My answer: How to obtain the full file path, full directory, and base filename of any script being run OR sourced...
  4. My Bash library: get an absolute timestamp in floating point seconds: https://github.com/ElectricRCAircraftGuy/eRCaGuy_hello_world/blob/master/bash/timestamp_lib_WIP.sh
  5. How to make a heredoc: https://linuxize.com/post/bash-heredoc/

See also

  1. This answer which links to another super simple speed test

Upvotes: 8

William Pursell

Reputation: 212544

There is an important caveat to using [[ ]]. Consider:

$ # For integer like strings, [ and [[ behave the same
$ 
$ n=5 # set n to a string that represents an integer
$ [[ $n -ge 0 ]] && printf "\t> $n is non-negative\n"
        > 5 is non-negative
$ [ "$n" -ge 0 ] && printf "\t> $n is non-negative\n"
        > 5 is non-negative
$ 
$ # But the behavior is different in several cases:
$ n=something # set n to some non-numeric string
$ [ "$n" -ge 0 ] && printf "\t> $n is non-negative\n"
-bash: [: something: integer expression expected
$ [[ $n -ge 0 ]] && printf "\t> $n is non-negative\n"
        > something is non-negative
$ n=5foo
$ [[ $n -ge 0 ]] && printf "\t> $n is non-negative\n"
-bash: 5foo: value too great for base (error token is "5foo")

To clarify the inconsistency of the behavior of [[, consider:

$ for n in 5foo 5.2 something; do [[ $n -ge 0 ]] && echo ok; done
-bash: 5foo: value too great for base (error token is "5foo")
-bash: 5.2: syntax error: invalid arithmetic operator (error token is ".2")
ok
$ for n in 5foo 5.2 something; do [ "$n" -ge 0 ] && echo ok; done
-bash: [: 5foo: integer expression expected
-bash: [: 5.2: integer expression expected
-bash: [: something: integer expression expected

Interpreting some non-numeric strings as 0 and the inconsistent error messages (or complete lack of an error message!) from the line [[ $n -ge 0 ]] when $n is not a valid integer makes [[ unsafe to use for integer comparisons. I strongly advise against [[ for numerical comparisons for this reason.
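
The "something is non-negative" result arises because [[ evaluates the operands of -ge as arithmetic expressions, so a bare word is read as a variable name, and an unset variable evaluates to 0. If you still want [[ for numeric tests, one possible guard is to validate the string first. This is only a sketch; is_integer and check_non_negative are hypothetical helper names, not Bash built-ins.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only for an optionally signed run of digits.
is_integer() {
    [[ $1 =~ ^[+-]?[0-9]+$ ]]
}

# Hypothetical helper: classify a string before comparing it numerically.
check_non_negative() {
    local n=$1
    if ! is_integer "$n"; then
        echo "invalid"
    elif [[ $n -ge 0 ]]; then
        echo "non-negative"
    else
        echo "negative"
    fi
}

for value in 5 -3 something 5foo 5.2; do
    printf '%s: %s\n' "$value" "$(check_non_negative "$value")"
done
```

Note that this sketch does not handle leading zeros: a value such as 08 passes the pattern but is then read as an invalid octal number by the arithmetic evaluation.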

Upvotes: 2

Behavior differences

Some differences on Bash 4.3.11:

  • POSIX vs Bash extension:

    • [ is POSIX
    • [[ is a Bash extension inspired by KornShell

  • regular command vs magic

    • [ is just a regular command with a weird name.

      ] is just the last argument of [.

      Ubuntu 16.04 actually has an executable for it at /usr/bin/[ provided by coreutils, but the Bash built-in version takes precedence.

      Nothing is altered in the way that Bash parses the command.

      In particular, < is redirection, && and || concatenate multiple commands, ( ) generates subshells unless escaped by \, and word expansion happens as usual.

    • [[ X ]] is a single construct that makes X be parsed magically. <, &&, || and () are treated specially, and word splitting rules are different.

      There are also further differences like = and =~.

    In Bashese: [ is a built-in command, and [[ is a keyword: What's the difference between shell builtin and shell keyword?

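  • <

    • [[ a < b ]]: lexicographical comparison
    • [ a \< b ]: same as above. \ is required, or else < does redirection as for any other command. Bash extension.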

  • && and ||

    • [[ a = a && b = b ]]: true, logical and
    • [ a = a && b = b ]: syntax error, && parsed as an AND command separator cmd1 && cmd2
    • [ a = a ] && [ b = b ]: POSIX reliable equivalent
    • [ a = a -a b = b ]: almost equivalent, but deprecated by POSIX because it is insane and fails for some values of a or b like ! or ( which would be interpreted as logical operations
  • (

    • [[ (a = a || a = b) && a = b ]]: false. Without ( ) it would be true, because [[ && ]] has greater precedence than [[ || ]]
    • [ ( a = a ) ]: syntax error, () is interpreted as a subshell
    • [ \( a = a -o a = b \) -a a = b ]: equivalent, but (), -a, and -o are deprecated by POSIX. Without \( \) it would be true, because -a has greater precedence than -o
    • { [ a = a ] || [ a = b ]; } && [ a = b ] non-deprecated POSIX equivalent. In this particular case however, we could have written just: [ a = a ] || [ a = b ] && [ a = b ], because the || and && shell operators have equal precedence, unlike [[ || ]] and [[ && ]] and -o, -a and [
  • word splitting and filename generation upon expansions (split+glob)

    • x='a b'; [[ $x = 'a b' ]]: true. Quotes are not needed
    • x='a b'; [ $x = 'a b' ]: syntax error. It expands to [ a b = 'a b' ]
    • x='*'; [ $x = 'a b' ]: syntax error if there's more than one file in the current directory.
    • x='a b'; [ "$x" = 'a b' ]: POSIX equivalent
  • =

    • [[ ab = a? ]]: true, because it does pattern matching (* ? [ are magic). Does not glob expand to files in the current directory.
    • [ ab = a? ]: a? glob expands. So it may be true or false depending on the files in the current directory.
    • [ ab = a\? ]: false, not glob expansion
    • = and == are the same in both [ and [[, but == is a Bash extension.
    • case ab in (a?) echo match; esac: POSIX equivalent
    • [[ ab =~ 'ab?' ]]: false, loses magic with '' in Bash 3.2 and above and provided compatibility to Bash 3.1 is not enabled (like with BASH_COMPAT=3.1)
    • [[ ab? =~ 'ab?' ]]: true
  • =~

    • [[ ab =~ ab? ]]: true. POSIX extended regular expression match and ? does not glob expand
    • [ a =~ a ]: syntax error. No Bash equivalent.
    • printf 'ab\n' | grep -Eq 'ab?': POSIX equivalent (single-line data only)
    • awk 'BEGIN{exit !(ARGV[1] ~ ARGV[2])}' ab 'ab?': POSIX equivalent.
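
A few of the differences listed above can be checked directly. The following is a small sketch, not an exhaustive test; the variable names are made up for illustration.

```bash
#!/usr/bin/env bash
# How Bash itself classifies the two names.
kind_single=$(type -t '[')
kind_double=$(type -t '[[')
echo "[ is a $kind_single, [[ is a $kind_double"

x='a b'

# [[ ]] does not word-split $x, so no quotes are needed.
if [[ $x = 'a b' ]]; then unquoted_double="true"; else unquoted_double="false"; fi

# [ ] needs the quotes; unquoted, $x would expand to two separate words.
if [ "$x" = 'a b' ]; then quoted_single="true"; else quoted_single="false"; fi

# Inside [[ ]], the right-hand side of = is a pattern, not a filename glob.
if [[ ab = a? ]]; then pattern_match="true"; else pattern_match="false"; fi

# POSIX spelling of the same pattern match.
case ab in
    (a?) case_match="true" ;;
    (*)  case_match="false" ;;
esac

echo "$unquoted_double $quoted_single $pattern_match $case_match"
```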

Recommendation: always use []

There are POSIX equivalents for every [[ ]] construct I've seen.

If you use [[ ]] you:

  • lose portability
  • force the reader to learn the intricacies of another Bash extension. [ is just a regular command with a weird name, and no special semantics are involved.

Thanks to Stéphane Chazelas for important corrections and additions.

Upvotes: 423

Vicente Bolea

Reputation: 1583

A typical situation where you cannot use [[ is in an autotools configure.ac script. There, brackets have a special and different meaning, so you will have to use test instead of [ or [[. Note that test and [ are the same program.
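
The reason is that Autoconf processes configure.ac with m4, where square brackets are the quoting characters. A shell fragment written with test avoids them entirely. The sketch below shows the kind of conditional that might appear in such a file; enable_feature and feature_status are hypothetical variable names.

```bash
# Hypothetical variable, set here only so the fragment runs on its own.
enable_feature=yes

# `test` does the same job as [ ] without using any brackets.
if test "x$enable_feature" = "xyes"; then
    feature_status="enabled"
else
    feature_status="disabled"
fi

echo "feature: $feature_status"
```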

Upvotes: 6

scavenger

Reputation: 408

[[ ]] double brackets are unsupported under certain versions of SunOS and totally unsupported inside function declarations by:

GNU Bash, version 2.02.0(1)-release (sparc-sun-solaris2.6)

Upvotes: 0

f3lix

Reputation: 29875

From Which comparator, test, bracket, or double bracket, is fastest?:

The double bracket is a “compound command” whereas test and the single bracket are shell built-ins (and in actuality are the same command). Thus, the single bracket and double bracket execute different code.

The test and single bracket are the most portable as they exist as separate and external commands. However, if you're using any remotely modern version of BASH, the double bracket is supported.

Upvotes: 24

anon

Reputation:

[[ ]] has more features - I suggest you take a look at the Advanced Bash Scripting Guide for more information, specifically the extended test command section in Chapter 7. Tests.

Incidentally, as the guide notes, [[ ]] was introduced in ksh88 (the 1988 version of KornShell).

Upvotes: 74

crizCraig

Reputation: 8917

If you are into following Google's style guide:

Test, [ … ], and [[ … ]]

[[ … ]] is preferred over [ … ], test and /usr/bin/[.

[[ … ]] reduces errors as no pathname expansion or word splitting takes place between [[ and ]]. In addition, [[ … ]] allows for regular expression matching, while [ … ] does not.

# This ensures the string on the left is made up of characters in
# the alnum character class followed by the string name.
# Note that the RHS should not be quoted here.
if [[ "filename" =~ ^[[:alnum:]]+name ]]; then
  echo "Match"
fi

# This matches the exact pattern "f*" (Does not match in this case)
if [[ "filename" == "f*" ]]; then
  echo "Match"
fi
# This gives a "too many arguments" error as f* is expanded to the
# contents of the current directory
if [ "filename" == f* ]; then
  echo "Match"
fi

For the gory details, see E14 at http://tiswww.case.edu/php/chet/bash/FAQ

Upvotes: 24

Johannes Schaub - litb

Reputation: 507243

[[ has fewer surprises and is generally safer to use. But it is not portable: POSIX doesn't specify what it does and only some shells support it (besides Bash, I heard ksh supports it too). For example, you can do

[[ -e $b ]]

to test whether a file exists. But with [, you have to quote $b, because it splits the argument and expands things like "a*" (where [[ takes it literally). That also has to do with how [ can be an external program that receives its arguments normally, like every other program (it can also be a builtin, but even then it does not get this special handling).
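
A small sketch of that difference, using a scratch file whose name contains a space. The directory and file names are made up for illustration.

```bash
#!/usr/bin/env bash
# Create a scratch file whose name contains a space.
tmpdir=$(mktemp -d)
b="$tmpdir/my file.txt"
touch "$b"

# [[ ]] takes $b as a single word even without quotes.
if [[ -e $b ]]; then double_result="exists"; else double_result="missing"; fi

# [ ] is an ordinary command, so the expansion must be quoted.
if [ -e "$b" ]; then single_result="exists"; else single_result="missing"; fi

echo "[[ ]]: $double_result, [ ]: $single_result"

# Clean up the scratch directory.
rm -rf "$tmpdir"
```

Dropping the quotes in the single-bracket test would split the path into two words, and [ would then report an error instead of a result.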

[[ also has some other nice features, like regular expression matching with =~, along with operators as they are known in C-like languages. Here are two good pages about it: What is the difference between test, [ and [[ ? and Bash Tests

Upvotes: 886

unix4linux

Reputation: 67

In a nutshell, [[ is better because it doesn't fork another process. No brackets or a single bracket is slower than a double bracket because it forks another process.

Upvotes: -1
