Exodos

Reputation: 43

Stripping leading zeros but leaving a single 0

Let me start off by saying that I'm new to bash, so I would appreciate simple explanations in your answers.

I've got the following block of code:

name="Chapter 0000 (sub s2).cbz "

s=$(echo $name | grep -Eo '[0-9]+([.][0-9]+)?' | tr '\n' ' ' | sed 's/^0*//')

echo $s

readarray -d " " -t myarr <<< "$s"

if [[ $(echo "${myarr[0]} < 100 && ${myarr[0]} >= 10" | bc) -ne 0 ]]; then
    myarr[0]="0${myarr[0]}"
elif [[ $(echo "${myarr[0]} < 10" | bc) -ne 0 ]]; then
    myarr[0]="00${myarr[0]}"
fi

newName="Chapter ${myarr[0]}.cbz"

echo $newName

which (in this case) would end up spitting out:

 2
(standard_in) 1: syntax error
(standard_in) 1: syntax error
Chapter .cbz

(I'm fairly certain that the syntax errors occur because ${myarr[0]} is empty when the comparisons run.)

This is not the output I want. I want the code to strip leading zeros but leave a single 0 if the number is all zeros.

So the part that really needs to change is sed 's/^0*//', but I'm not sure how to change it.

(expected outputs:

              in   --->   out
1) chapter 8.cbz   ---> Chapter 008.cbz
2) chapter 1.3.cbz   ---> Chapter 001.3.cbz
3) _23 (sec 2).cbz   ---> Chapter 023.cbz
4) chapter 00009.cbz   ---> Chapter 009.cbz
5) chap 0000112.5.cbz   ---> Chapter 112.5.cbz
6) Chapter 0000 (sub s2).cbz   ---> Chapter 000.cbz

So far the code I have works for cases 1-3 but not for the leading-zero cases (4-6).)

Upvotes: 0

Views: 311

Answers (6)

Fravadona

Reputation: 17300

In pure bash:

#!/bin/bash

for name in \
    'chapter 8.cbz' \
    'chapter 1.3.cbz' \
    '_23 (sec 2).cbz' \
    'chapter 00009.cbz' \
    'chap 0000112.5.cbz' \
    'Chapter 0000 (sub s2).cbz' \
    '_23.2 (sec 2).cbz'
do

##### The relevant part #####

[[ $name =~ ^[^0-9]*([0-9]+)([0-9.]*)[^.]*(\..*)$ ]]

chapter=$(( 10#${BASH_REMATCH[1]} ))
suffix="${BASH_REMATCH[2]}${BASH_REMATCH[3]}"

newName=$(printf 'Chapter %03d%s' "$chapter" "$suffix")

#############################

echo "$newName"

done

Output:

Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
Chapter 000.cbz
Chapter 023.2.cbz

notes:

  • [[ =~ ]] is the way to use an ERE regex in bash. The one I wrote has three capture groups: the first captures the first sequence of digits (which should be the chapter number), the second captures any digits and dots that directly follow it (the fractional part, if any), and the third captures everything from the next dot onward (the file extension).
  • $(( 10#... )) converts a zero-prefixed decimal into a normal decimal. This is needed because bash arithmetic would otherwise treat a number that starts with 0 as octal.
  • printf '%03d' prints a number as a decimal of at least 3 digits, padding it on the left with zeros when it is shorter.
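
The last two notes can be tried in isolation. The following is a minimal sketch, assuming bash; the helper name pad3 is made up for illustration and is not part of the answer above.

```bash
#!/bin/bash
# Hypothetical helper (name is an assumption): pad a possibly
# zero-prefixed string of digits to at least three digits.
pad3() {
    local digits=$1
    # 10# forces base 10, so "00009" is read as 9 instead of being
    # rejected as an invalid octal number.
    printf '%03d\n' "$(( 10#$digits ))"
}

pad3 0000      # prints 000
pad3 00009     # prints 009
pad3 0000112   # prints 112
```

Without the 10# prefix, bash would try to read 00009 as octal and report an error, because 9 is not an octal digit.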

Upvotes: 2

jhnc

Reputation: 16908

I think you could produce the table of results with sed alone:

sed '
    s/^[^0-9]*/000/
    s/[^0-9.].*$//
    s/\.*$/.cbz/
    s/^0*\([0-9]\{3\}\)/Chapter \1/
' <<'EOD'
chapter 8.cbz
chapter 1.3.cbz
_23 (sec 2).cbz
chapter 00009.cbz
chap 0000112.5.cbz
chap 04567.cbz
EOD

  • The first command strips everything before the first number and prepends zeros to ensure there are at least three digits.
  • The second command strips off everything after the number. (This may leave a trailing period that is not part of the number as the code defines a number to be any sequence of digits and periods).
  • The third command deletes any trailing periods and adds the desired suffix.
  • The final command removes the longest run of leading zeroes that leaves (at least) three digits (I added an extra test case to demonstrate) and adds the desired prefix.

Result of running this would be:

Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz
Chapter 4567.cbz
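
The test input above does not include the all-zero case from the question. As a quick check, here is a minimal sketch that wraps the same four sed commands in a function; the name chapter_name is an assumption for illustration only.

```bash
#!/bin/bash
# Hypothetical wrapper (name is an assumption) around the same four
# sed commands, so a single file name can be passed as an argument.
chapter_name() {
    printf '%s\n' "$1" | sed '
        s/^[^0-9]*/000/
        s/[^0-9.].*$//
        s/\.*$/.cbz/
        s/^0*\([0-9]\{3\}\)/Chapter \1/
    '
}

chapter_name 'Chapter 0000 (sub s2).cbz'   # prints Chapter 000.cbz
chapter_name 'chapter 1.3.cbz'             # prints Chapter 001.3.cbz
```

The all-zero name works because the prepended 000 guarantees that at least three digits are left for the final command to keep.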

Upvotes: 1

potong

Reputation: 58578

This might work for you (GNU sed):

sed -E 's/\b0+(0\.)?/\1/' file

This removes leading zeroes; the optional group is meant to keep a single zero when it directly precedes a period.
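
For comparison, here is a minimal sketch of a different expression (not the command posted above) aimed at the sed step in the question: it removes leading zeros only when at least one more digit follows, so an all-zero number collapses to a single 0. The helper name strip_zeros is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper (name is an assumption): drop leading zeros but
# always keep the digit that follows them, so "0000" becomes "0".
strip_zeros() {
    printf '%s\n' "$1" | sed -E 's/^0+([0-9])/\1/'
}

strip_zeros 0000        # prints 0
strip_zeros 00009       # prints 9
strip_zeros 0000112.5   # prints 112.5
strip_zeros 8           # prints 8 (no leading zero, left unchanged)
```

The expression is anchored with ^, so in the question's pipeline it only affects the first number in the string.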

Upvotes: 0

Dudi Boy

Reputation: 4900

Another one-liner sed command:

Testing file input.1.txt

1) chapter 8.cbz   
2) chapter 1.3.cbz 
3) _23 (sec 2).cbz 
4) chapter 00009.cbz
5) chap 0000112.5.cbz

sed command

sed -E '{s/(^[^ ]*)([^[:digit:]]+)([[:digit:]]+[\. ]?[[:digit:]]*)([\. ].*$)/000\3/;s/([[:digit:]]+)([[:digit:]]{3})(.*$)/Chapter \2\3.cbz/}' input.1.txt

output

Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz

Upvotes: 1

sseLtaH

Reputation: 11247

Using sed

$ sed 's/[^0-9]*0\+\?\([0-9]\{1,\}\)[^.]*\(\..*\)/Chapter 00\1\2/;s/0\+\([0-9]\{3,\}\)/\1/' file
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz

s/[^0-9]*0\+\?\([0-9]\{1,\}\)[^.]*\(\..*\)/Chapter 00\1\2/ - Strip everything up to the first non-zero digit, then add Chapter at the beginning, followed by two zeros in place of the stripped leading zeros.

s/0\+\([0-9]\{3,\}\)/\1/ - Once again, strip excess zeros ensuring only three digits before the period remain.

Upvotes: 2

Dudi Boy

Reputation: 4900

Here is an awk script that does the trick:

script.awk

{
  str = "000" gensub("(^[[:digit:]]+\\.?[[:digit:]]*)( \\([^)]+\\))?(\\.cbz)", "\\1", "g", RT);
  str = gensub("(^[[:digit:]]+)([[:digit:]]{3})(.*$)", "\\2\\3", "g", str);
  printf("Chapter %s.cbz\n", str);
}

Test input.1.txt

1) chapter 8.cbz   
2) chapter 1.3.cbz 
3) _23 (sec 2).cbz 
4) chapter 00009.cbz
5) chap 0000112.5.cbz

Output:

awk -f script.awk RS='[[:digit:]]+[\\.]?[[:digit:]]*( \\([^)]+\\))?\\.cbz' input.1.txt
Chapter 008.cbz
Chapter 001.3.cbz
Chapter 023.cbz
Chapter 009.cbz
Chapter 112.5.cbz

Upvotes: 1
