ch3quers

Reputation: 13

How can I separate the numeric parts of filenames using Python?

I am trying to write a short program that looks through a directory, renames the image files to match the name of their directory, and renumbers and sorts them for later processing. So far I can get the name of the folder and substitute it for a specific part of each filename, using the following:

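import os

# Avoid calling the variable "str", which would shadow the built-in type
folder = os.getcwd()
print(folder)
# The last path component is the directory's own name
ext = os.path.basename(folder)
print(ext)

for n in os.listdir(folder):
    print(n)
    path = os.path.join(folder, n)
    if os.path.isfile(path):
        filename_zero, extension = os.path.splitext(n)
        # Swap the literal 'image' in the name for the directory name
        new_name = filename_zero.replace('image', ext) + extension
        os.rename(path, os.path.join(folder, new_name))

for n in os.listdir(folder):
    print(n)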

What I can't do is get the numeric part on its own. My filenames are of the type storm000045.tiff and never have underscores or dots for me to separate them by. Any advice is appreciated. Thanks in advance!

Upvotes: 1

Views: 193

Answers (4)

Ansuman Bebarta

Reputation: 7256

You can use translate() from the string module. The catch is that this solution pulls every digit out of the string, and it does not check whether any letters follow the digits. If your names have the form xxxxdddd.ext, with no digits in the extension, it should work.

translate(s, table[, deletechars]): returns a copy of the string in which every character has been translated using table. If deletechars is present, all characters that appear in it are removed from the copy.

maketrans(from, to): creates a table to be used by translate().

>>> import string
>>> # Identity table: the from string equals the to string
>>> s = string.maketrans('', '')
>>> # Characters to delete: everything except the digits
>>> d = s.translate(s, string.digits)
>>> # Use s and d together to keep only the digits of a string
>>> x = 'asdffasd23424'
>>> x.translate(s, d)
'23424'
>>> x = 'asdf33433as444'
>>> x.translate(s, d)
'33433444'
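
The session above is Python 2 code: string.maketrans and the two-argument form of translate() are not available in current Python 3 releases. As a rough sketch of the same idea for Python 3, the third argument of str.maketrans can list the characters to delete. The names NON_DIGITS, DELETE_NON_DIGITS and digits_only are made up for illustration, and the deletion set assumes the filenames contain only ASCII letters, punctuation and whitespace besides the digits.

```python
import string

# Assumption: apart from digits, names contain only ASCII letters,
# punctuation, and whitespace. Other characters would be left in place.
NON_DIGITS = string.ascii_letters + string.punctuation + string.whitespace

# The third argument of str.maketrans lists the characters to delete.
DELETE_NON_DIGITS = str.maketrans('', '', NON_DIGITS)


def digits_only(text):
    """Return text with every character in NON_DIGITS removed."""
    return text.translate(DELETE_NON_DIGITS)


print(digits_only('asdf33433as444'))    # 33433444
print(digits_only('storm000045.tiff'))  # 000045
```

As with the original, digits anywhere in the name are kept, so an extension such as .mp4 would contribute its 4.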

Upvotes: 1

ma6174

Reputation: 1

>>> a = "storm000045.tiff"
>>> print a[5:11]
000045

Upvotes: -2

Inbar Rose

Reputation: 43447

Use this simple function:

import re
def get_name_and_number(text):
    return re.match(r'(\D+)(\d+).*', text).groups()

Example:

>>> get_name_and_number('storm000045.tiff')
('storm', '000045')

Or this one:

def extract_numbers(text):
    return ''.join([x for x in text if x.isdigit()])

Example:

>>> extract_numbers('storm000045.tiff')
'000045'
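
Both functions assume the name is well formed: get_name_and_number raises AttributeError when nothing matches, and extract_numbers also picks up digits in the extension. A possible variant, sketched here for Python 3, strips the extension first and returns None for names without a trailing number. The function name split_name_and_number and the anchored pattern are assumptions for illustration, not part of the answer above.

```python
import os
import re


def split_name_and_number(filename):
    """Split 'storm000045.tiff' into ('storm', '000045', '.tiff').

    Returns None when the stem does not end in a run of digits.
    """
    # Separate the extension first so digits in it are never matched.
    stem, extension = os.path.splitext(filename)
    # Assumed layout: optional non-digit prefix followed by digits only.
    match = re.match(r'^(\D*)(\d+)$', stem)
    if match is None:
        return None
    prefix, digits = match.groups()
    return prefix, digits, extension


print(split_name_and_number('storm000045.tiff'))  # ('storm', '000045', '.tiff')
print(split_name_and_number('notes.txt'))         # None
```

Keeping the digits as a string preserves the zero padding; int(digits) gives the numeric value when renumbering.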

Upvotes: 2

falsetru

Reputation: 369064

Using re:

>>> import re
>>> re.split(r'(\d+)', 'torm000045.tiff')
['torm', '000045', '.tiff']
>>> re.split(r'(\d+)', 'torm000_045.tiff')
['torm', '000', '_', '045', '.tiff']
>>> re.split(r'(\d+)', 'torm000_045.tiff')[1::2]
['000', '045']

The 2nd, 4th, 6th, ... elements (the odd indices, hence the [1::2] slice) are the numeric parts.
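
Since the question also mentions sorting the files, the same split can serve as a sort key that compares the numeric parts as integers rather than as text. This is only a sketch for Python 3; natural_key and the sample names are invented for illustration.

```python
import re


def natural_key(name):
    """Build a sort key in which digit runs compare as integers."""
    # re.split with a capturing group alternates text and digit runs,
    # so the odd positions always hold the digit runs.
    parts = re.split(r'(\d+)', name)
    return [int(part) if index % 2 else part
            for index, part in enumerate(parts)]


names = ['storm10.tiff', 'storm000045.tiff', 'storm2.tiff']
print(sorted(names, key=natural_key))
# ['storm2.tiff', 'storm10.tiff', 'storm000045.tiff']
```

A plain sorted(names) would instead put storm000045.tiff first, because it compares the names character by character.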

Upvotes: 1
