Reputation: 43
So let me start off by saying that I'm new to bash, so I would appreciate a simple explanation in the answers you give.
I've got the following block of code:
name="Chapter 0000 (sub s2).cbz "
s=$(echo $name | grep -Eo '[0-9]+([.][0-9]+)?' | tr '\n' ' ' | sed 's/^0*//')
echo $s
readarray -d " " -t myarr <<< "$s"
if [[ $(echo "${myarr[0]} < 100 && ${myarr[0]} >= 10" | bc) -ne 0 ]]; then
myarr[0]="0${myarr[0]}"
elif [[ $(echo "${myarr[0]} < 10" | bc) -ne 0 ]]; then
myarr[0]="00${myarr[0]}"
fi
newName="Chapter ${myarr[0]}.cbz"
echo $newName
which (in this case) would end up spitting out:
2
(standard_in) 1: syntax error
(standard_in) 1: syntax error
Chapter .cbz
(I'm fairly certain that the syntax errors are because ${myarr[0]} is null when doing the comparisons.)
This is not the output I want. I want the code to strip leading 0s but leave a single 0 if it's all 0s.
So the code that really needs to change would be sed 's/^0*//', but I'm not sure how to change it.
(expected outputs:
in ---> out
1) chapter 8.cbz ---> Chapter 008.cbz
2) chapter 1.3.cbz ---> Chapter 001.3.cbz
3) _23 (sec 2).cbz ---> Chapter 023.cbz
4) chapter 00009.cbz ---> Chapter 009.cbz
5) chap 0000112.5.cbz ---> Chapter 112.5.cbz
6) Chapter 0000 (sub s2).cbz ---> Chapter 000.cbz
So far the code I've got works for 1-3 but not for the leading-0 cases (4-6).)
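The guess about ${myarr[0]} being empty can be checked in isolation. The sketch below repeats only the stripping and splitting steps from the code above, starting from the string that tr produces for the sample name; it assumes bash 4.4 or later, since readarray -d is not available before that.

```bash
#!/bin/bash
# Reproduce only the stripping and splitting steps from the code above.
# "0000 2 " is what grep and tr produce for "Chapter 0000 (sub s2).cbz ".
s=$(printf '0000 2 ' | sed 's/^0*//')   # every leading 0 is removed, leaving " 2 "
readarray -d " " -t myarr <<< "$s"      # splitting on spaces makes element 0 empty
printf 'first element: [%s]\n' "${myarr[0]}"
printf 'second element: [%s]\n' "${myarr[1]}"
```

Because the first element is empty, bc receives an expression such as " < 100 &&  >= 10", which would explain the two syntax errors.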
Upvotes: 0
Views: 311
Reputation: 17300
In pure bash:
#!/bin/bash
for name in \
'chapter 8.cbz' \
'chapter 1.3.cbz' \
'_23 (sec 2).cbz' \
'chapter 00009.cbz' \
'chap 0000112.5.cbz' \
'Chapter 0000 (sub s2).cbz' \
'_23.2 (sec 2).cbz'
do
##### The relevant part #####
[[ $name =~ ^[^0-9]*([0-9]+)([0-9.]*)[^.]*(\..*)$ ]]
chapter=$(( 10#${BASH_REMATCH[1]} ))
suffix="${BASH_REMATCH[2]}${BASH_REMATCH[3]}"
newName=$(printf 'Chapter %03d%s' "$chapter" "$suffix")
#############################
echo "$newName"
done
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
Chapter 000.cbz
Chapter 023.2.cbz
Notes:
[[ =~ ]] is the way to use an ERE regex in bash. The one that I wrote has three capture groups: the first takes the first sequence of digits (which should be the chapter number), the second takes any digits and dots that directly follow it (such as .3), and the third takes the remainder of the name starting at a dot (such as .cbz). Whatever sits between the second and third groups, such as " (sec 2)", is dropped.
$(( 10#... )) converts a zero-prefixed decimal to a normal decimal. This is needed because bash arithmetic would otherwise treat a number that starts with 0 as octal instead of decimal.
printf '%03d' formats a number as a decimal of at least 3 digits, padding the left with zeros when it is shorter.
Upvotes: 2
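As a small illustration of the last two notes for the pure bash answer, the sketch below shows the octal reading next to the 10# form and then combines 10# with printf. The function name pad_chapter is hypothetical and used only for this example.

```bash
#!/bin/bash
# pad_chapter is a hypothetical helper name used only for this illustration.
pad_chapter() {
    # 10# forces base 10, so leading zeros are harmless.
    printf '%03d\n' "$(( 10#$1 ))"
}

echo "$(( 010 ))"      # prints 8: a leading 0 means octal
echo "$(( 10#010 ))"   # prints 10: 10# forces decimal

pad_chapter 00009      # prints 009
pad_chapter 0000       # prints 000
pad_chapter 0000112    # prints 112
```

A value such as 00009 is the harder case, because 9 is not a valid octal digit, so the 10# prefix is what makes the arithmetic usable at all for that input.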
Reputation: 16908
I think you could produce the table of results with sed alone:
sed '
s/^[^0-9]*/000/
s/[^0-9.].*$//
s/\.*$/.cbz/
s/^0*\([0-9]\{3\}\)/Chapter \1/
' <<'EOD'
chapter 8.cbz
chapter 1.3.cbz
_23 (sec 2).cbz
chapter 00009.cbz
chap 0000112.5.cbz
chap 04567.cbz
EOD
Result of running this would be:
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
Chapter 4567.cbz
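To see what each of the four substitutions contributes, they can be applied one at a time. This is only a tracing sketch built around the same sed expressions; the function name trace_steps is hypothetical.

```bash
#!/bin/bash
# trace_steps is a hypothetical helper: it applies the four substitutions
# one by one and prints the line after each step.
trace_steps() {
    local line=$1
    local step
    for step in \
        's/^[^0-9]*/000/' \
        's/[^0-9.].*$//' \
        's/\.*$/.cbz/' \
        's/^0*\([0-9]\{3\}\)/Chapter \1/'
    do
        line=$(printf '%s\n' "$line" | sed "$step")
        printf '%-36s -> %s\n' "$step" "$line"
    done
}

trace_steps 'chap 0000112.5.cbz'
```

The first step replaces the leading text with three zeros so that short numbers can be padded, the second cuts the line at the first character that is neither a digit nor a dot, the third restores the .cbz extension, and the last drops surplus zeros while keeping at least three digits.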
Upvotes: 1
Reputation: 58578
This might work for you (GNU sed):
sed -E 's/\b0+(0\.)?/\1/' file
Remove leading zeroes but leave an optional zero.
Upvotes: 0
Reputation: 4900
Another one-liner sed command:
1) chapter 8.cbz
2) chapter 1.3.cbz
3) _23 (sec 2).cbz
4) chapter 00009.cbz
5) chap 0000112.5.cbz
sed -E '{s/(^[^ ]*)([^[:digit:]]+)([[:digit:]]+[\. ]?[[:digit:]]*)([\. ].*$)/000\3/;s/([[:digit:]]+)([[:digit:]]{3})(.*$)/Chapter \2\3.cbz/}' input.1.txt
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
Upvotes: 1
Reputation: 11247
Using sed
$ sed 's/[^0-9]*0\+\?\([0-9]\{1,\}\)[^.]*\(\..*\)/Chapter 00\1\2/;s/0\+\([0-9]\{3,\}\)/\1/' file
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
s/[^0-9]*0\+\?\([0-9]\{1,\}\)[^.]*\(\..*\)/Chapter 00\1\2/
- Strip everything up to the first digit, plus any leading zeros, then add Chapter at the beginning along with two zeros in front of the remaining number.
s/0\+\([0-9]\{3,\}\)/\1/
- Once again, strip the excess zeros, leaving at least three digits before the period.
Upvotes: 2
Reputation: 4900
Here is an awk script that does the trick:
{
str = "000" gensub("(^[[:digit:]]+\\.?[[:digit:]]*)( \\([^)]+\\))?(\\.cbz)", "\\1", "g", RT);
str = gensub("(^[[:digit:]]+)([[:digit:]]{3})(.*$)", "\\2\\3", "g", str);
printf("Chapter %s.cbz\n", str);
}
1) chapter 8.cbz
2) chapter 1.3.cbz
3) _23 (sec 2).cbz
4) chapter 00009.cbz
5) chap 0000112.5.cbz
awk -f script.awk RS='[[:digit:]]+[\\.]?[[:digit:]]*( \\([^)]+\\))?\\.cbz' input.1.txt
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
Upvotes: 1