Reputation: 19
I have the following subject folder structure:
./sub-CC0006/func
..
./sub-CC0199/func
Within the func folder I have a file called sub-CC0006_ses-core2p2_task-loi3_run-01_events.tsv. When I tried to put the code below in a loop, it did not work. (I first tried to loop over each subject directory and then change the .tsv file names based on the subject number.)
awk -F"\t" -v OFS="\t" '{
for (i=1;i<=NF;i++) {
if ($i == "NaN") $i="n/a"
}
print $0
}' sub-CC0006_ses-core2p2_task-loi3_run-01_events.tsv > sub-CC0006_ses-core2p2_task-loi3_run-01_events_new.tsv &&
mv sub-CC0006_ses-core2p2_task-loi3_run-01_events_new.tsv sub-CC0006_ses-core2p2_task-loi3_run-01_events.tsv
Here is an extract from one of the files I am trying to manipulate:
onset | response_time |
---|---|
9 | NaN |
12 | 1.4 |
Upvotes: 0
Views: 106
Reputation: 753525
The basic technique for overwriting a file with an edited version of the file uses a generic temporary file name as the intermediary file.
I'm assuming that in the sub-CC0199 directory, the func subdirectory will contain sub-CC0199_ses-core2p2_task-loi3_run-01_events.tsv and that any other files in the directory are to be ignored, and similarly for each other directory. The script becomes simpler if you want to process all the files (or all the *.tsv files, or some other pattern match) in each of the func subdirectories for each of the subjects.
tmpfile=$(mktemp "map.XXXXXX")
trap "rm -f $tmpfile; exit 1" 0 1 2 3 13 15
suffix="_ses-core2p2_task-loi3_run-01_events.tsv"
for directory in sub-CC0???
do
file="$directory/func/$directory$suffix"
if [ -f "$file" ]
then
awk '…' "$file" > "$tmpfile" &&
mv "$tmpfile" "$file"
fi
done
rm -f "$tmpfile" # Remove the temporary
trap 0 # Cancel the 'exit' trap; the script exits with status 0
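A minimal sketch of the simpler variant mentioned above, assuming every *.tsv file in each func subdirectory should be processed. The awk program is the one from the question; the nan_demo directory and its sample files are made up so the sketch can run on its own.

```shell
#!/bin/sh
# Demo data (hypothetical): two subjects, one events file each.
demo=nan_demo
mkdir -p "$demo/sub-CC0006/func" "$demo/sub-CC0199/func"
printf 'onset\tresponse_time\n9\tNaN\n12\t1.4\n' \
    > "$demo/sub-CC0006/func/sub-CC0006_ses-core2p2_task-loi3_run-01_events.tsv"
printf 'onset\tresponse_time\n3\tNaN\n' \
    > "$demo/sub-CC0199/func/sub-CC0199_ses-core2p2_task-loi3_run-01_events.tsv"

tmpfile=$(mktemp "map.XXXXXX") || exit 1
trap 'rm -f "$tmpfile"' EXIT          # clean up if the script stops early

# The glob replaces the per-subject file name construction.
for file in "$demo"/sub-CC0???/func/*.tsv
do
    [ -f "$file" ] || continue        # skip the literal pattern if nothing matched
    awk -F'\t' -v OFS='\t' '{
        for (i = 1; i <= NF; i++) if ($i == "NaN") $i = "n/a"
        print
    }' "$file" > "$tmpfile" &&
    mv "$tmpfile" "$file"
done
```

With real data, drop the demo setup and run the loop from the directory that holds the sub-CC0??? folders.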
If you're worried about preserving links (or ownership, or permissions) on the original file, or that the original file might be a symlink you want to preserve, you can use cp "$tmpfile" "$file"; rm -f "$tmpfile" instead of mv. It's slightly slower, though unless the files are big, probably not measurably slower.
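As a rough illustration of the cp variant, the sketch below edits a file through a symlink and leaves the link in place. The cp_demo directory and the real.tsv and link.tsv names are hypothetical, chosen only for the demonstration; the awk program is the one from the question.

```shell
#!/bin/sh
# Demo data (hypothetical): link.tsv is a symlink to real.tsv.
demo=cp_demo
mkdir -p "$demo"
printf 'onset\tresponse_time\n9\tNaN\n' > "$demo/real.tsv"
ln -sf real.tsv "$demo/link.tsv"

tmpfile=$(mktemp "map.XXXXXX") || exit 1
trap 'rm -f "$tmpfile"' EXIT          # clean up if the script stops early

file="$demo/link.tsv"
awk -F'\t' -v OFS='\t' '{
    for (i = 1; i <= NF; i++) if ($i == "NaN") $i = "n/a"
    print
}' "$file" > "$tmpfile" &&
cp "$tmpfile" "$file"                 # writes through the symlink into real.tsv
rm -f "$tmpfile"
```

Afterwards link.tsv is still a symlink and real.tsv holds the edited content; with mv, link.tsv would have been replaced by a regular file carrying the temporary file's permissions.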
You could generate the temporary file name within the loop; it might be marginally safer to do so if you're worried about malicious actors. The file is new (did not exist before) when created by mktemp, but once you've moved it, the name is free again, and a malicious person could create their own symlink with that name pointing somewhere sensitive, so the script could damage other files unexpectedly. (You could also copy the temporary file over the original without removing the temporary, so the same file is used for each .tsv file; the options are legion.) You're probably not working in an environment that hostile, though.
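One way the per-iteration temporary file could look, as a sketch rather than a definitive version. The loop_demo directory and its sample data are made up, and the awk program is the one from the question.

```shell
#!/bin/sh
# Demo data (hypothetical): two subjects, one events file each.
demo=loop_demo
suffix="_ses-core2p2_task-loi3_run-01_events.tsv"
for s in sub-CC0006 sub-CC0199
do
    mkdir -p "$demo/$s/func"
    printf 'onset\tresponse_time\n9\tNaN\n' > "$demo/$s/func/$s$suffix"
done

tmpfile=""
trap 'if [ -n "$tmpfile" ]; then rm -f "$tmpfile"; fi' EXIT

for directory in "$demo"/sub-CC0???
do
    subject=${directory##*/}          # strip the leading path, keep sub-CCxxxx
    file="$directory/func/$subject$suffix"
    [ -f "$file" ] || continue
    tmpfile=$(mktemp "map.XXXXXX") || exit 1   # fresh name for every file
    awk -F'\t' -v OFS='\t' '{
        for (i = 1; i <= NF; i++) if ($i == "NaN") $i = "n/a"
        print
    }' "$file" > "$tmpfile" &&
    mv "$tmpfile" "$file"
done
```

Because each name is created immediately before use, a name that has already been moved away is never written to again.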
The trap list is for "EXIT" (0) and signals 1 (SIGHUP), 2 (SIGINT), 3 (SIGQUIT), 13 (SIGPIPE) and 15 (SIGTERM). I learned to script when only the numbers worked, and they're compact. If you want to be slightly more modern, you could list the short names of the signals and conditions:
trap "rm -f $tmpfile; exit 1" EXIT HUP INT QUIT PIPE TERM
…
trap EXIT
or (to cancel multiple traps, though it's unnecessary when the script is about to exit):
trap - EXIT HUP INT QUIT PIPE TERM
Upvotes: 1