Reputation: 73722
I run across many shell scripts with variables in all caps, and I've always thought that there is a severe misunderstanding with that. My understanding is that, by convention (and perhaps by necessity long ago), environment variables are in all-caps.
But in modern scripting environments like Bash, I have always preferred the convention of lower-case names for temporary variables, and upper-case ones only for exported (i.e. environment) variables. For example:
#!/usr/bin/env bash
year=$(date +%Y)
echo "It is $year."
export JAVA_HOME="$HOME/java"
That has always been my take on things. Are there any authoritative sources which either agree or disagree with this approach?
Upvotes: 313
Views: 103843
Reputation: 629
Let's be clear on our terminology. Environment variables are those variables set by the Bash environment (e.g. ${0}, ${SHELL}, ${SECONDS}, etc.) and which do not need to be set by the user. User Variables (and Constants) are set by the user either in their .bash_profile, .bashrc, or in a particular script file. User variables can be exported to the environment to become Environment variables; however, unless exported, the scope of a User variable is limited to the current interpreter execution (either the interactive shell or the executing shell script; it will not be passed to any child process). If an Environment variable is unset, or reset, it will usually lose any special meaning or value.
In my 30+ years writing shell scripts, doing Build and Release and some System Administration, I've seen all of the aforementioned variable styles. Unix allows variable names composed of majuscule and minuscule characters or any mix of the two sets; Linux adopted this same abomination for some unknown reason, probably portability. Posix strongly encourages the use of the majuscule character set, as do almost all texts on Bash programming. My conclusion is that this is a convention that is widely adopted and used, but it is not strictly required and you are free to make any poor choice you wish.
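The export behaviour described above can be checked with a minimal sketch. The names build_dir and JOB_HOME are hypothetical, and a child bash is started only to show what it inherits:

```shell
#!/usr/bin/env bash
# Hypothetical names, chosen for illustration only.
build_dir="/tmp/build"          # plain shell variable: not passed to child processes
export JOB_HOME="/tmp/job"      # exported: becomes part of every child's environment

# Ask a child shell what it can see. The single quotes stop the parent
# shell from expanding the variables before the child runs.
child_view=$(bash -c 'echo "build_dir=[${build_dir:-}] JOB_HOME=[${JOB_HOME:-}]"')
echo "$child_view"
```

The child should report an empty build_dir and the exported JOB_HOME value.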
That said, there are some conventions that are used because of their utility and because they help programmers efficiently and effectively develop useful and maintainable code. When I write bash code:
I use majuscule characters and the '_' characters for all variable and constant names.
I typeset (AKA define) and initialize all variables (and constants), specifying the variable type (integer, read only, exported, array, hash, etc.), and I keep them local to scripts and functions where possible (no, not everything needs to be global in Bash).
I use '{' and '}' characters around all variables, syntactically required or not. This avoids unintentional naming errors (which I have seen in practice) and makes the Variable/Constant stand out.
I always use "#!/usr/bin/env bash" now, and previously always used "#!/usr/bin/bash" on systems where "/usr/bin/env" was not available.
I use "shopt -s extglob # Turn on extended globbing" in my scripts because the extended pattern-matching operators are great to have when I'm doing pattern matching.
I always use "set -o pipefail -o nounset -o errtrace -o functrace" to avoid issues with pipes failing in the middle, to catch fat-fingered variable names, and to ease the tracing of errors and functions. I know of others that often use "shopt -s inherit_errexit nullglob compat" and I can see the utility of these options as well.
All error messages I print out follow a pattern that will let the programmer know where in the code the error was found and reported:
echo -e "ERROR [${LINENO}] in ${FUNCNAME[*]}: ..." 1>&2
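As a rough sketch of the typed declarations and the error-message pattern from the list above: every name here (MAX_RETRIES, RETRY_COUNT, INPUT_FILES, JOB_STATUS, ReportError) is hypothetical. The helper uses BASH_LINENO and FUNCNAME[1] so that it reports its caller, which is a variation on the inline ${LINENO} form rather than the same thing. The associative array assumes Bash 4 or later.

```shell
#!/usr/bin/env bash
# Hypothetical names, for illustration only.
declare -r MAX_RETRIES=3          # read-only constant
declare -i RETRY_COUNT=0          # integer: assigned values are evaluated arithmetically
declare -a INPUT_FILES=()         # indexed array
declare -A JOB_STATUS=()          # associative array ("hash"), needs Bash 4+

ReportError() {
    # BASH_LINENO[0] is the line from which ReportError was called;
    # FUNCNAME[1] is the calling function ("main" at script top level).
    echo "ERROR [${BASH_LINENO[0]}] in ${FUNCNAME[1]:-main}: ${*}" 1>&2
}

RETRY_COUNT=RETRY_COUNT+1         # the integer attribute turns this into 1
JOB_STATUS[build]="failed"
ReportError "build ${JOB_STATUS[build]} after ${RETRY_COUNT} of ${MAX_RETRIES} tries"
```

Attempting to assign to MAX_RETRIES afterwards fails with a "readonly variable" error, which is the point of declaring constants this way.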
Consistently using widely accepted conventions and good programming practices can significantly reduce debug time and make your code easily portable and maintainable. For example, Bash doesn't require defining and initializing variables, but doing so can prevent the use of uninitialized values, lets users write better code, and helps detect misspelled variable names.
Having worked on code that uses all minuscule characters for variables and constants, my experience is that this practice makes it very difficult to clearly see where the variable is being used, and makes it very easy to make mistakes.
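A small sketch of how nounset and braces catch the kinds of naming mistakes mentioned above. LOG_FILE and the deliberately misspelled LOG_FLIE are hypothetical names, and child shells are used so the typo cannot abort the demonstration itself:

```shell
#!/usr/bin/env bash
# Hypothetical variable names, for illustration only.
LOG_FILE="/tmp/app.log"

# Without nounset, the misspelled name silently expands to an empty string.
without_nounset=$(bash -c 'LOG_FILE=/tmp/app.log; echo "path=[${LOG_FLIE}]"' 2>&1)

# With nounset, the same typo makes the shell exit with a non-zero status.
if bash -c 'set -o nounset; LOG_FILE=/tmp/app.log; echo "path=[${LOG_FLIE}]"' 2>/dev/null; then
    with_nounset="typo went unnoticed"
else
    with_nounset="typo detected"
fi

# Braces mark where the name ends: $LOG_FILE_old is a different, unset variable.
unbraced="$LOG_FILE_old"
braced="${LOG_FILE}_old"

echo "${without_nounset} / ${with_nounset} / [${unbraced}] / [${braced}]"
```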
I use camel case naming in function names (personal preference, not convention). This makes it clear that I am calling a local function which I've created or sourced into the environment.
Lastly, I recommend using the "source" command, in place of the older '.' character when sourcing in code from another file. If nothing else, finding all the places where I'm sourcing something is much easier with this option.
There are a lot of skills I've learned in my career, far more than are relevant to this topic (yes, I've wandered far afield), but Bash is an incredibly useful and ubiquitous programming tool on *nix systems. Learning to write clear and maintainable code by following the common conventions is a mark of professional growth.
Upvotes: 5
Reputation: 629
Bash, and most shell script interpreters, distinguish between global variables and variables local to a function (e.g. via typeset, declare, local), and local variables should be used where appropriate. As previously commented, "Environment variable names used by the utilities in the Shell and Utilities volume of POSIX.1-2017 consist solely of uppercase letters, digits, and the underscore ( _ ) from the characters defined in Portable Character Set and do not begin with a digit. ... The name space of environment variable names containing lowercase letters is reserved for applications. Applications can define any environment variables with names from this name space without modifying the behavior of the standard utilities." (POSIX IEEE Std 1003.1-2008 section 8.1)
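To illustrate the difference between local and global variables in a function, here is a minimal sketch. The function and variable names are hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical function and variable names, for illustration only.
counter="global"

with_local() {
    local counter="changed inside with_local"   # visible only here and in functions called from here
}

without_local() {
    counter="changed inside without_local"      # no "local": overwrites the global
}

with_local
after_local="$counter"
without_local
after_global="$counter"
echo "after with_local: ${after_local}; after without_local: ${after_global}"
```

Only the function that omits local changes the value seen by the rest of the script.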
Upvotes: 1
Reputation: 42999
Any naming conventions followed consistently will always help. Here are a few helpful tips for shell variable naming:
Use all caps and underscores for exported variables and constants, especially when they are shared across multiple scripts or processes. Use a common prefix whenever applicable so that related variables stand out and won't clash with Bash internal variables which are all upper case.
Examples:
JOB_HOME
JOB_LOG
JOB_TEMP
JOB_RUN_CONTROL
LOG_DEBUG
LOG_INFO
LOG_ERROR
STATUS_OK
STATUS_ERROR
STATUS_WARNING
Use "snake case" (all lowercase and underscores) for all variables that are scoped to a single script or a block.
Examples:
input_file
first_value
max_amount
num_errors
Use mixed case when a local variable has some relationship with an environment variable, like:
old_IFS
old_HOME
Use a leading underscore for "private" variables and functions. This is especially relevant if you ever write a shell library where functions within a library file or across files need to share variables, without ever clashing with anything that might be similarly named in the main code.
Examples:
_debug
_debug_level
_current_log_file
Avoid camel case. This will minimize the bugs caused by case typos. Remember, shell variables are case sensitive.
Examples (to avoid):
inputArray
thisLooksBAD
numRecordsProcessed
veryInconsistent_style
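Put together, the tips above might look like the following sketch. All names (JOB_HOME, STATUS_OK, _debug_level, split_csv_line and its variables) are hypothetical and only illustrate the naming styles:

```shell
#!/usr/bin/env bash
# All names below are hypothetical; they only illustrate the conventions.
export JOB_HOME="/opt/job"        # exported and shared: caps with a common prefix
readonly STATUS_OK=0              # constant: caps

_debug_level=0                    # "private" helper state: leading underscore

split_csv_line() {
    local input_line="$1"         # scoped to this function: snake case
    local old_IFS="$IFS"          # mixed case: relates to the shell's IFS
    local -a fields
    IFS=','
    read -r -a fields <<< "$input_line"
    IFS="$old_IFS"                # restore the separator for the rest of the script
    num_fields="${#fields[@]}"
    second_field="${fields[1]}"
}

split_csv_line "alpha,beta,gamma"
echo "num_fields=${num_fields} second_field=${second_field}"
```

Saving and restoring IFS through old_IFS keeps the change from leaking into later word splitting.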
Upvotes: 80
Reputation: 11983
If shell variables are going to be exported to the environment, it’s worth considering that the POSIX (Issue 7, 2018 edition) Environment Variable Definition specifies:
Environment variable names used by the utilities in the Shell and Utilities volume of POSIX.1-2017 consist solely of uppercase letters, digits, and the underscore ( _ ) from the characters defined in Portable Character Set and do not begin with a digit.
...
The name space of environment variable names containing lowercase letters is reserved for applications. Applications can define any environment variables with names from this name space without modifying the behavior of the standard utilities.
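As a hedged sketch, the rule quoted above can be turned into a check on a candidate name. is_utility_style_name is a hypothetical helper, not something POSIX or Bash provides, and the characters are listed explicitly so the result does not depend on the locale:

```shell
#!/usr/bin/env bash
# is_utility_style_name is a hypothetical helper, for illustration only.
is_utility_style_name() {
    case "$1" in
        '' | [0123456789]*) return 1 ;;                            # empty, or begins with a digit
        *[!ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_]*) return 1 ;;    # contains any other character
    esac
    return 0
}

summary=""
for name in JAVA_HOME PATH my_app_cache 2FAST; do
    if is_utility_style_name "$name"; then
        summary+="${name}:yes "
    else
        summary+="${name}:no "
    fi
done
summary="${summary% }"            # drop the trailing space
echo "$summary"
```

Names that fail the check because they contain lowercase letters, such as my_app_cache, fall in the name space the standard reserves for applications.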
Upvotes: 31
Reputation: 125406
By convention, environment variables (PAGER, EDITOR, ...) and internal shell variables (SHELL, BASH_VERSION, ...) are capitalized. All other variable names should be lower case.
Remember that variable names are case-sensitive; this convention avoids accidentally overriding environment and internal variables.
Keeping to this convention, you can rest assured that you don't need to know every environment variable used by UNIX tools or shells in order to avoid overwriting them. If it's your variable, lowercase it. If you export it, uppercase it.
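A minimal sketch of the kind of accident this convention prevents, assuming ls is normally found on your PATH. The directory name is made up, and each assignment runs in a subshell so the surrounding script is unaffected:

```shell
#!/usr/bin/env bash
# The path below is hypothetical; subshells keep the damage contained.
clobbered=$(
    PATH="/tmp/some/data/file"    # meant as "path to my file", but PATH is special
    if command -v ls >/dev/null 2>&1; then echo "ls still found"; else echo "ls no longer found"; fi
)

safe=$(
    path="/tmp/some/data/file"    # lowercase: unrelated to the shell's PATH
    if command -v ls >/dev/null 2>&1; then echo "ls still found"; else echo "ls no longer found"; fi
)

echo "PATH=...: ${clobbered}; path=...: ${safe}"
```

With the uppercase name the shell can no longer locate external commands; with the lowercase name nothing changes.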
Upvotes: 398
Reputation:
Actually, the term "environment variables" seems to be of fairly recent coinage. Kernighan and Pike in their classic book "The UNIX Programming Environment", published in 1984, speak only of "shell variables" - there is not even an entry for "environment" in the index!
Upvotes: 5
Reputation: 62593
I tend to use ALL_CAPS both for environment and global variables. Of course, in Bash there's no real variable scope, so there's a good portion of variables used as globals (mostly settings and state tracking), and relatively few 'locals' (counters, iterators, partly-constructed strings, and temporaries).
Upvotes: 1
Reputation: 34711
I do what you do. I doubt there's an authoritative source, but it seems a fairly widespread de-facto standard.
Upvotes: 5
Reputation: 339816
It's just a very widely held convention; I doubt there's any "authoritative" source for it.
Upvotes: 0