Reputation: 11
I'm new to Bash scripting. I have a requirement to convert multiple input files in UTF-8 encoding to ISO 8859-1.
I am using the below command, which is working fine for the conversion part:
cd ${DIR_INPUT}/
for f in *.txt; do iconv -f UTF-8 -t ISO-8859-1 $f > ${DIR_LIST}/$f; done
However, when I don't have any text files in my input directory ($DIR_INPUT), it still creates an empty .txt file in my output directory ($DIR_LIST).
How can I prevent this from happening?
Upvotes: 1
Views: 1053
Reputation: 21502
As @ghoti pointed out, in the absence of files matching the wildcard expression a*, the expression itself becomes the result of pathname expansion. By default (when the nullglob option is unset), a* is expanded to, literally, a*.
You can set the nullglob option, of course. But then you should be aware that all subsequent pathname expansions will be affected, unless you unset the option after the loop.
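A minimal sketch of scoping nullglob to a single loop, assuming Bash (shopt does not exist in plain sh). The scratch directory and the variable names count and leftover are made up for illustration:

```shell
# Bash-specific: shopt is not available in plain sh.
workdir=$(mktemp -d)   # hypothetical empty scratch directory

shopt -s nullglob      # unmatched patterns now expand to nothing
count=0
for f in "$workdir"/*.txt; do
    count=$((count + 1))
done
shopt -u nullglob      # restore the default for later expansions

# With nullglob off again, an unmatched pattern stays literal.
set -- "$workdir"/*.txt
leftover=$1

echo "loop iterations: $count"
echo "after unset: ${leftover##*/}"
rmdir "$workdir"
```

The loop body never runs while nullglob is set, and the later expansion falls back to the literal pattern once the option is unset again.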
I would rather use the find command, which has a clear interface (and, in my opinion, is less likely to perform implicit conversions than Bash globbing). E.g.:
# The whole redirection target is quoted so file names with spaces stay intact.
cmd='iconv --verbose -f UTF-8 -t ISO-8859-1 "$0" > "$1/$(basename "$0")"'
find "${DIR_INPUT}/" \
    -mindepth 1 \
    -maxdepth 1 \
    -type f \
    -name '*.txt' \
    -exec sh -c "$cmd" {} "${DIR_LIST}" \;
In the example above, $0 and $1 are positional arguments for the file path and ${DIR_LIST} respectively. The command is invoked via the standard shell (sh) because of the need to refer to the file path {} twice, and because the > redirection has to be interpreted by a shell. Although most modern implementations of find may handle multiple occurrences of {} correctly, the POSIX specification states:
If more than one argument containing the two characters "{}" is present, the behavior is unspecified.
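To see how sh -c assigns the positional arguments used above, here is a small sketch. The two paths are made-up values, and the variable name result is only for illustration:

```shell
# sh -c assigns the first argument after the script to $0, the next to $1.
# The two paths are made-up values for illustration.
result=$(sh -c 'printf "%s -> %s" "$0" "$1"' "/in/a.txt" "/out")
echo "$result"
```

A common variant passes a placeholder word as $0 so that the real arguments start at $1. The example in this answer uses $0 for the file path directly.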
As in the for loop, the -name pattern *.txt is evaluated as true if the basename of the current pathname matches the operand (*.txt) using the pattern matching notation. But, unlike in the for loop, filename expansion does not apply, as this is a matching operation, not an expansion.
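The difference can be checked with a short sketch, assuming a find that supports -mindepth and -maxdepth (GNU and BSD find do). The scratch directory and the names before and after are hypothetical:

```shell
src=$(mktemp -d)   # hypothetical scratch directory, initially empty

# No .txt files yet: find prints nothing and would run no -exec command.
before=$(find "$src" -mindepth 1 -maxdepth 1 -type f -name '*.txt')

touch "$src/a.txt" "$src/b.log"

# Now exactly one file matches the quoted pattern.
after=$(find "$src" -mindepth 1 -maxdepth 1 -type f -name '*.txt')

echo "before: '${before}'"
echo "after:  '${after##*/}'"
rm -r "$src"
```

Because the pattern is quoted, the shell passes it to find unchanged, and find never produces a literal *.txt entry when nothing matches.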
Upvotes: 1
Reputation: 46876
The empty file *.txt is being created in your output directory because, by default, bash leaves an unmatched glob pattern as the literal string that you supplied. You can change this behaviour in a number of ways, but what you're probably looking for is shopt -s nullglob. Observe:
$ for i in a*; do echo "$i"; done
a*
$ shopt -s nullglob
$ for i in a*; do echo "$i"; done
$
You can find documentation about this in the bash man page under Pathname Expansion.
In your case, I'd probably rewrite the loop like this:
shopt -s nullglob
for f in "$DIR_INPUT"/*.txt; do
    iconv -f UTF-8 -t ISO-8859-1 "$f" > "${DIR_LIST}/${f##*/}"
done
This avoids the need for the initial cd, and uses parameter expansion to strip off the path portion of $f for the output redirection. With nullglob set, the loop body simply does not run when nothing matches, so no work is done on a nonexistent file.
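As a quick illustration of the ${f##*/} expansion, here is a sketch with hypothetical paths. The names name and target are made up, and nothing is read or written:

```shell
# Hypothetical paths, chosen only to illustrate the expansion.
f="/data/input/report 2024.txt"
DIR_LIST="/data/output"

name=${f##*/}            # remove the longest prefix matching */
target="$DIR_LIST/$name"

echo "$name"
echo "$target"
```

This gives the same result as basename for ordinary paths like this one, without starting an external process for every file.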
Upvotes: 1