Reputation: 159
I'm looking to replace characters at specific byte offsets.
Here's what is provided: An input file that is simple ASCII text, and an array within a Bash shell script, each element of which is a numerical byte offset.
The goal: Take the input file, and at each of the byte-offsets, replace the character there with an asterisk.
So essentially the idea I have in mind is to somehow go through the file, byte-by-byte, and if the current byte-offset being read is a match for an element value from the array of offsets, then replace that byte with an asterisk.
This post seems to indicate that the dd command would be a good candidate for this action, but I can't understand how to perform the replacement multiple times on the input file.
Input file looks like this:
00000
00000
00000
The array of offsets looks like this:
offsetsArray=("2" "8" "9" "15")
The output file's desired format looks like this:
0*000
0**00
00*00
Any help you could provide is most appreciated. Thank you!
Upvotes: 3
Views: 2880
Reputation: 16016
Please check my comment about the newline offset. Assuming this is correct (note I have changed your offset array), then I think this should work for you:
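#!/bin/bash
# Read all of stdin into $REPLY (the empty delimiter means read stops only at EOF)
read -r -d ''
offsetsArray=("2" "8" "9" "15")
txt="${REPLY}"
for i in "${offsetsArray[@]}"; do
    # Keep the first i-1 characters, insert '*', then resume after character i
    txt="${txt:0:$i-1}*${txt:$i}"
done
printf "%s" "$txt"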
Explanation:
read -d '' reads the whole input (redirected file) in one go into the $REPLY variable. If you have large files, this can run you out of memory.
The loop then uses each offset i to grab i-1 characters from the beginning of the string, then insert a * character, then add the remaining bytes from offset i. This is done with bash parameter expansion. Note that while your offsets are one-based, bash strings use zero-based indexing.
In use:
$ ./replacechars.sh < input.txt
0*000
0**00
00*00
$
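To see the indexing in isolation, here is a minimal sketch of the same expansion applied to a single five-character string. The variable name result is only for illustration and is not part of the script above.

```shell
#!/bin/bash
txt="00000"
i=2                                  # one-based offset of the character to replace
# ${txt:0:$i-1} is the first i-1 characters; ${txt:$i} is everything from zero-based index i on
result="${txt:0:$i-1}*${txt:$i}"
printf '%s\n' "$result"              # prints 0*000
```

Because ${txt:$i} starts at zero-based index i, it skips exactly the one character at one-based position i, which is the one being replaced.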
Caveat:
This is not really a very efficient solution, as it causes the string containing the whole file to be copied for every offset. If you have large files and/or a large number of offsets, then this will run slowly. If you need something faster, then another language that allows modification of individual characters in a string would be much better.
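If staying in bash, one possible way to reduce the copying is to build the output once from the runs between offsets, instead of rebuilding the whole string per offset. This is a different technique from the loop above, sketched under the assumption that the offsets are one-based, unique, and already sorted in ascending order. The function name star_single_pass is hypothetical, and whether this is actually faster on a given file has not been measured here.

```shell
#!/bin/bash
# Hypothetical helper: star_single_pass TEXT OFFSET...
# Assumes the offsets are one-based, unique, and sorted ascending.
star_single_pass() {
    local txt=$1 out="" prev=0 off
    shift
    for off in "$@"; do
        # Copy the untouched run between the previous offset and this one, then add '*'
        out+="${txt:prev:off-1-prev}*"
        prev=$off
    done
    # Append whatever follows the last replaced character
    printf '%s' "$out${txt:prev}"
}

offsetsArray=("2" "8" "9" "15")
star_single_pass $'00000\n00000\n00000\n' "${offsetsArray[@]}"
```

Unsorted or repeated offsets would produce a negative length in the expansion, so sort and deduplicate the array first if that cannot be guaranteed.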
Upvotes: 4
Reputation: 23374
With the same offset considerations as @DigitalTrauma's superior solution, here's a GNU awk-based alternative. This assumes your file contains no null bytes:
(IFS=','; awk -F '' -v RS=$'\0' -v OFS='' -v offsets="${offsetsArray[*]}" \
'BEGIN{split(offsets, o, ",")};{for (k in o) $o[k]="*"; print}' file)
0*000
0**00
00*00
Upvotes: 2
Reputation: 69062
The usage of dd can be a bit confusing at times, but it's not that hard:
outfile="test.txt"
# create some test data
echo -n 0123456789abcde > "$outfile"
offsetsArray=("2" "7" "8" "13")
for offset in "${offsetsArray[@]}"; do
dd bs=1 count=1 seek="$offset" conv=notrunc of="$outfile" <<< '*'
done
cat "$outfile"
Important for this example is to use conv=notrunc, otherwise dd truncates the file to the length of blocks it seeks over. bs=1 specifies that you want to work with blocks of size 1, and seek specifies the offset at which to start writing count blocks.
The above produces 01*3456**9abc*e
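Note that seek counts blocks from zero, while the offsets in the question are one-based. A minimal sketch of the same dd loop adapted to the question's sample input could look like the following. The function name star_at_offsets and the scratch file are assumptions made for illustration, and stdin is fed with printf rather than a here-string so that exactly one byte is supplied.

```shell
#!/bin/bash
# Hypothetical helper: star_at_offsets FILE OFFSET...
# Offsets are one-based, as in the question; FILE is modified in place.
star_at_offsets() {
    local file=$1 offset
    shift
    for offset in "$@"; do
        # seek counts blocks from zero, so one-based offset N maps to seek=N-1
        printf '*' | dd bs=1 count=1 seek="$((offset - 1))" conv=notrunc of="$file" 2>/dev/null
    done
}

# Try it on a scratch copy of the question's sample input
scratch=$(mktemp)
printf '00000\n00000\n00000\n' > "$scratch"
offsetsArray=("2" "8" "9" "15")
star_at_offsets "$scratch" "${offsetsArray[@]}"
cat "$scratch"
rm -f "$scratch"
```

With the sample input this prints 0*000, 0**00 and 00*00 on three lines. Since dd overwrites the file in place, work on a copy if the original needs to be kept.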
Upvotes: 3