arudeyeti

Reputation: 43

How to insert zeros within file names to make them the same length

I want to give a large batch of JPG files names of the same length, and then convert them to PDF. For example: 65-1A, 66-10B, 72-108C -> 65-001A, 66-010B, 72-108C. The target format is XX-XXXX; if a file has a shorter name, insert enough 0's after the dash to reach the target length.

I tried parsing the file names, but I want to make them all the same length first:

import os

def parseFilename(file):
    baseFileName = os.path.splitext(file)[0]
    parts = baseFileName.split('-')
    year = parts[0]
    sequence = 0
    permitNumber = 0
    pageNumber = 0

    if len(parts) > 1:

        if len(parts[1]) == 2:
            permitNumber = (parts[1])[0:1]
            pageNumber = (parts[1])[1:2]

        if len(parts[1]) == 3:
            permitNumber = (parts[1])[0:2]
            pageNumber = (parts[1])[2:3]

        if len(parts[1]) == 4:
            permitNumber = (parts[1])[0:2]
            pageNumber = (parts[1])[2:3]
            sequence = (parts[1])[3:4]

    permitNumber = '{0}_{1}'.format(year, permitNumber)
    return (permitNumber, pageNumber)

Since the file names all have different lengths, the function has trouble parsing them correctly. I believe giving them all the same naming format would make this easier.

Upvotes: 3

Views: 1278

Answers (2)

dbort

Reputation: 1004

str.format()'s padding can help here. For example,

>>> '{:0>4}'.format('1A')
'001A'

The format :0>4 says:

  • :: indicates that this is a format specifier
  • 0: pad with zero
  • >: right-align
  • 4: pad up to four characters

We could change a couple of these to pad with a different character and a different width:

>>> '{:#>6}'.format('1A')
'####1A'
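
For comparison, here is a small sketch of the same zero-padding written three ways. It assumes Python 3.6+ for the f-string, and the variable names are only illustrative:

```python
part = '1A'

# Three equivalent ways to left-pad with zeros to a width of 4.
with_format = '{:0>4}'.format(part)   # the format spec described above
with_fstring = f'{part:0>4}'          # same spec inside an f-string
with_rjust = part.rjust(4, '0')       # right-justify, filling with '0'

print(with_format, with_fstring, with_rjust)
```

All three leave a string untouched if it is already four or more characters long.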

Here's an example function that does the padding. As a bonus, it handles missing or extra hyphenated fields.

def PadName(orig):
  # Split into fields.
  parts = orig.split('-')
  # Make sure there are enough fields.
  while len(parts) < 2:
    # This added field will become all zeros.
    parts.append('')
  # Pad the first two fields appropriately; leave the rest alone.
  parts[0] = '{:0>2}'.format(parts[0])
  parts[1] = '{:0>4}'.format(parts[1])
  # Join the fields back together.
  return '-'.join(parts)


testNames = (
  '72-108C',  # Already formatted
  '5-1A74',  # First part short
  '65-1A',  # Second part short
  '2-10C',  # Both parts short
  '5A1-1A74',  # First part long
  '5A-1A74C',  # Second part long
  '15-4-AA',  # Extra field
  'B',  # Only one field
  '',  # Empty
)


for baseFileName in testNames:
  padded = PadName(baseFileName)
  print('{:10}  >  {:10}'.format(baseFileName, padded))

Output:

72-108C     >  72-108C   
5-1A74      >  05-1A74   
65-1A       >  65-001A   
2-10C       >  02-010C   
5A1-1A74    >  5A1-1A74  
5A-1A74C    >  5A-1A74C  
15-4-AA     >  15-0004-AA
B           >  0B-0000   
            >  00-0000  

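To apply this kind of padding to the actual files, something like the following might work. It is only a sketch: pad_name and rename_padded are hypothetical helper names, the '*.jpg' pattern and the refusal to overwrite existing files are my assumptions, and the padding is applied to the name without its extension.

```python
from pathlib import Path


def pad_name(stem, widths=(2, 4)):
    """Zero-pad the first two hyphen-separated fields of a file stem."""
    parts = stem.split('-')
    # Add empty fields if any are missing; they become all zeros.
    while len(parts) < len(widths):
        parts.append('')
    for i, width in enumerate(widths):
        parts[i] = parts[i].rjust(width, '0')
    return '-'.join(parts)


def rename_padded(folder, pattern='*.jpg'):
    """Rename matching files in folder; return (old, new) name pairs."""
    renamed = []
    # sorted() builds the full list first, so renaming is safe in the loop.
    for path in sorted(Path(folder).glob(pattern)):
        target = path.with_name(pad_name(path.stem) + path.suffix)
        if target == path:
            continue  # name is already in the target format
        if target.exists():
            # Assumption: never overwrite an existing file.
            raise FileExistsError(f'{target} already exists')
        path.rename(target)
        renamed.append((path.name, target.name))
    return renamed
```

Calling rename_padded on a copy of the folder first is probably wise, since the renames are not undone if something fails partway through.
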
Upvotes: 4

Prune

Reputation: 77857

Split the name at the hyphen. Pad the second part as needed. Rejoin the parts. The step you're missing is computing how many leading 0's you need: that's four minus the length of that filename part.

tags = ["65-1A", "66-10B", "72-108C"]

for f in tags:
    parts = f.split('-')
    new_part1 = '0' * (4-len(parts[1])) + parts[1]
    new_part = '-'.join([parts[0], new_part1])
    print(new_part)

Output:

65-001A
66-010B
72-108C
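
The same arithmetic could be wrapped in a function that also copes with names that have no hyphen. This is only a sketch: pad_second_part is a hypothetical name, and leaving hyphen-less names unchanged is my assumption, not something the question specifies.

```python
def pad_second_part(name, width=4):
    """Insert leading zeros after the first hyphen so the rest is `width` long."""
    head, sep, tail = name.partition('-')
    if not sep:
        # Assumption: a name without a hyphen is returned unchanged.
        return name
    # Number of zeros needed is width minus the current length, never negative.
    zeros_needed = max(0, width - len(tail))
    return head + sep + '0' * zeros_needed + tail


for f in ["65-1A", "66-10B", "72-108C"]:
    print(pad_second_part(f))
```

Note that partition splits only at the first hyphen, so anything after a second hyphen counts toward the length of the second part.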

Upvotes: 1
