Reputation: 9815
Here's a snippet:
var=`ls | shuf | head -2 | xargs cat | sed -e 's/\(.\)/\1\n/g' | shuf | tr -d '\n'`
This selects two random files from the current directory, concatenates their contents, shuffles the characters, and assigns the result to var. This works fine most of the time, but about once in a thousand runs the output of ls is bound to var instead (it's not just the ls output; see EDIT II). What could be the explanation?
Some more potentially relevant facts:
GNU bash, version 4.1.5(1)-release (i686-pc-linux-gnu)
Linux 2.6.35-28-generic-pae #50-Ubuntu
EDIT: I ran the snippet by itself a couple of thousand times with no errors. Then I tried running it with various other parts of the whole script. Here's a configuration that produces errors:
cd dir_with_text_files
var=`ls | shuf | head -2 | xargs cat | sed -e 's/\(.\)/\1\n/g' | shuf | tr -d '\n'`
cd ..
There are several hundred lines of the script between the two cd commands, but this is the minimal configuration that reproduces the error. Note that the anomalous output binds to var the ls output for the current directory, not for dir_with_text_files.
EDIT II: I've been looking at the outputs in more detail. The ls output doesn't appear alone; it comes along with the two shuffled files (between their contents, or after or before them, intact). But it gets better; let me set the stage to talk about particular directories.
[~/projects/upload] ls -1
checked // dir
lines // dir, the files to shuffle are here
pages // also dir
proxycheck
singlepost
uploader
indexrefresh
t
tester
So far, I've seen the output of ls run from upload, but now I saw the output of ls */* (also run from upload). It was in the form of "someMangledText ls moreMangledText ls */* finalBatchOfText", with ls and ls */* standing for the output of those commands. Is it possible that the sequence ls, which was undoubtedly generated by the shuffle, was somehow executed?
Upvotes: 14
Views: 977
Reputation: 11
For debugging purposes you may also clear the environment using env -i and filter out non-printable characters:
#!/usr/bin/env -i /bin/bash --
set -ef
set -o pipefail
unset IFS PATH LC_ALL
IFS=$' \t\n'
PATH="$(PATH=/bin:/usr/bin getconf PATH)"
LC_ALL=C
export IFS PATH LC_ALL
#var="$( (find . -maxdepth 1 -type f -print0 | shuf -z -n 2 | xargs -0 cat) | sed -e 's/\(.\)/\1\n/g' | shuf | tr -d '\n')"
var="$( (find . -maxdepth 1 -type f -print0 | shuf -z -n 2 | xargs -0 cat) | tr -cd '[:print:]' | grep -o '.' | shuf | tr -d '\n')"
Before running the script you may also disable the GNU readline library and ! style history expansion:
bash --noediting
set +H
Upvotes: 1
Reputation: 17674
No problems here either. I would also rewrite the above to this:
sed 's:\(.\):\1\n:g' < <(shuf -e * | head -2 | xargs cat) | shuf | tr -d '\n'
Do not use ls to list a directory's contents; use * instead.
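A minimal sketch of the glob-based selection, assuming GNU shuf is available. The helper name pick_two is made up for illustration and is not part of the original script; it also skips subdirectories, which the one-liner above does not do.

```shell
#!/usr/bin/env bash
# pick_two DIR: print two randomly chosen regular files from DIR, one per line.
# Hypothetical helper for illustration only.
pick_two() {
    local dir=$1 f
    local files=()
    for f in "$dir"/*; do
        # Keep regular files only; an unmatched glob stays literal and fails this test.
        [ -f "$f" ] && files+=("$f")
    done
    # Refuse to continue when there are fewer than two candidates.
    [ "${#files[@]}" -ge 2 ] || return 1
    # shuf -e treats its arguments as the input lines; -n 2 limits the output to two.
    shuf -n 2 -e -- "${files[@]}"
}
```

Because the names come from a glob and are passed as separate arguments, file names containing spaces survive intact. Names containing newlines would still be split by anything that reads the output line by line.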
Moreover, do some debugging. Use a shebang followed by:
set -e
set -o pipefail
and run the script like this:
/bin/bash -x /path/to/script
and do inspect the output.
Instead of debugging the whole script, you can surround just the part that seems to be problematic with set -x and set +x:
set -x
...code that may have problems...
set +x
so that the output focuses on that part of the code.
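As a rough illustration of scoped tracing, here is a small sketch. The PS4 value and the shuffle_chars function are assumptions made for this example, not taken from the original script. The trace is written to stderr, so it does not end up in the captured variable.

```shell
#!/usr/bin/env bash
# Prefix every traced command with the script name and line number.
PS4='+ ${BASH_SOURCE##*/}:${LINENO}: '

# Hypothetical stand-in for the pipeline under suspicion:
# split stdin into one character per line, shuffle the lines, re-join them.
shuffle_chars() {
    grep -o . | shuf | tr -d '\n'
}

set -x                                  # start tracing
var=$(printf 'abc' | shuffle_chars)
set +x                                  # stop tracing

printf '%s\n' "$var"
```

Redirecting stderr to a file (for example 2>trace.log) keeps the trace of each run, which helps when the failure shows up only once in a thousand runs.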
Also, use the pipefail option.
Some definitions:
-e: Exit immediately if a simple command exits with a non-zero status, unless the command that fails is part of the command list immediately following a while or until keyword, part of the test in an if statement, part of a && or || list, or if the command's return status is being inverted using !. A trap on ERR, if set, is executed before the shell exits.
-x: Print a trace of simple commands, for commands, case commands, select commands, and arithmetic for commands and their arguments or associated word lists after they are expanded and before they are executed. The value of the PS4 variable is expanded and the resultant value is printed before the command and its expanded arguments.
pipefail: If set, the return value of a pipeline is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands in the pipeline exit successfully.
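The effect of pipefail can be seen with a two-command pipeline whose first command fails. This is only a small demonstration; the variable names are chosen for the example.

```shell
#!/usr/bin/env bash
# Without pipefail the pipeline status is that of the last command (cat),
# so the failure of `false` goes unnoticed.
set +o pipefail
status_default=0
false | cat || status_default=$?

# With pipefail a failure anywhere in the pipeline becomes the pipeline status.
set -o pipefail
status_pipefail=0
false | cat || status_pipefail=$?
set +o pipefail

printf 'default=%s pipefail=%s\n' "$status_default" "$status_pipefail"
# prints: default=0 pipefail=1
```

In a long pipeline such as the one in the question, this makes a failing ls, xargs or cat visible instead of being masked by the final tr.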
Upvotes: 2
Reputation: 1787
Based on what you say about your failure rates, and given that the other posters above could not reproduce the problem, it sounds like it could be caused by an occasional directory-change failure. Is the directory you're accessing in this script mounted from a remote machine, by chance? If so, it might just be a small, temporary network-related failure that's not being handled properly. (Just a guess.)
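If a failing cd is the suspect, one way to rule it out is to refuse to run the pipeline unless the directory change succeeded. A minimal sketch follows; run_in_dir is a hypothetical helper name, not something from the original script.

```shell
#!/usr/bin/env bash
# run_in_dir DIR CMD...: run CMD inside DIR, but only if the cd succeeds.
# Hypothetical helper; the subshell leaves the caller's working directory unchanged.
run_in_dir() {
    local dir=$1
    shift
    (
        cd -- "$dir" || { printf 'cannot cd to %s\n' "$dir" >&2; exit 1; }
        "$@"
    )
}
```

For example, var=$(run_in_dir dir_with_text_files some_command) either runs some_command in the intended directory or fails with a message on stderr, and no cd .. is needed afterwards.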
Upvotes: 0