drbunsen

Reputation: 10709

File naming problem with Python

I am trying to iterate through a number of .rtf files and, for each file, read it, perform some operations, and then write a new plain text file into a sub-directory with the same name as the original file, but with a .txt extension. The problem I am having is with the file naming.

If a file is named foo.rtf, I want the new file in the subdirectory to be foo.txt. Here is my code:

import glob
import os
import numpy as np


dir_path = '/Users/me/Desktop/test/'
file_suffix = '*.rtf'
output_dir = os.mkdir('sub_dir')
for item in glob.iglob(dir_path + file_suffix):
    with open(item, "r") as infile:
        reader = infile.readlines()
        matrix = []
        for row in reader:
            row = str(row)
            row = row.split()
            row = [int(value) for value in row]
            matrix.append(row)
        np_matrix = np.array(matrix)
        inv_matrix = np.transpose(np_matrix)
        new_file_name = item.replace('*.rtf', '*.txt') # i think this line is the problem?
        os.chdir(output_dir)
        with open(new_file_name, mode="w") as outfile:
            outfile.write(inv_matrix)

When I run this code, I get a TypeError:

TypeError: coercing to Unicode: need string or buffer, NoneType found

How can I fix my code to write new files into a subdirectory and change the file extensions from .rtf to .txt? Thanks for the help.

Upvotes: 0

Views: 342

Answers (3)

Niklas R

Reputation: 16900

I've never used glob, but here's an alternative way without using a module:
You can easily strip the suffix using

name = name[:name.rfind('.')]

and then add the new suffix:

name = name + '.txt'

Why not use a function?

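def change_suffix(string, new_suffix):
    # Find the last '.' and treat everything after it as the suffix.
    i = string.rfind('.')
    if i < 0:
        raise ValueError('string does not have a suffix')
    # Make sure the new suffix starts with a '.'.
    if not new_suffix.startswith('.'):
        new_suffix = '.' + new_suffix
    return string[:i] + new_suffix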
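
One caveat with the rfind approach, shown as a small sketch: rfind('.') searches the whole string, so for a full path it can pick up a dot in a directory name when the file itself has no extension. The paths and the helper name strip_suffix_rfind below are made up for illustration; os.path.splitext is shown for comparison because it only looks at the last path component.

```python
import os.path

def strip_suffix_rfind(name):
    # Same slicing as above: cut at the last '.' anywhere in the string.
    return name[:name.rfind('.')]

# Hypothetical paths, for illustration only.
plain = '/tmp/foo.rtf'
dotted_dir = '/tmp/my.data/foo'   # no extension, but a dot in the directory

rfind_plain = strip_suffix_rfind(plain)
rfind_dotted = strip_suffix_rfind(dotted_dir)
splitext_dotted = os.path.splitext(dotted_dir)[0]

print(rfind_plain)       # the extension is stripped as intended
print(rfind_dotted)      # the path is cut inside the directory name
print(splitext_dotted)   # the path is left unchanged
```

If the names passed in are always bare file names with an extension, the slicing approach behaves the same as splitext.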

Upvotes: 2

Remi

Reputation: 21175

glob.iglob() yields pathnames, which do not contain the '*' character, so your line should be:

new_file_name = item.replace('.rtf', '.txt') 

Consider working with clearer names (reserve 'filename' for a file name and use 'path' for a complete path to a file; use 'path_original' instead of 'item'), os.extsep (the extension separator, normally '.') and os.path.splitext():

path_txt = os.extsep.join([os.path.splitext(path_original)[0], 'txt'])
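
As a rough illustration of why splitext can be safer than str.replace: replace rewrites every occurrence of '.rtf' in the path, not just the extension at the end. The path below is hypothetical and chosen to show the difference.

```python
import os

# Hypothetical path whose directory name also contains '.rtf'.
path_original = '/data/old.rtf_files/foo.rtf'

# str.replace rewrites every occurrence, including the directory name.
via_replace = path_original.replace('.rtf', '.txt')

# os.path.splitext only splits off the final extension.
via_splitext = os.extsep.join([os.path.splitext(path_original)[0], 'txt'])

print(via_replace)
print(via_splitext)
```

For paths like the one in the question, where '.rtf' appears only once, both approaches give the same result.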

Now the best hint of all: numpy can probably read your file directly:

data = np.genfromtxt(filename, unpack=True)
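
A minimal sketch of that idea, assuming the files really contain nothing but whitespace-separated integers (the sample contents and the temporary directory are invented for the demo). np.savetxt is used for the output because passing the array to outfile.write() does not produce a text table.

```python
import os
import tempfile

import numpy as np

# Build a small throwaway input file (hypothetical contents).
work_dir = tempfile.mkdtemp()
in_path = os.path.join(work_dir, 'foo.rtf')
with open(in_path, 'w') as f:
    f.write('1 2 3\n4 5 6\n')

# unpack=True returns the transposed array, so no explicit np.transpose.
data = np.genfromtxt(in_path, unpack=True)

# Write the array as plain text, one row per line, formatted as integers.
out_path = os.path.splitext(in_path)[0] + '.txt'
np.savetxt(out_path, data, fmt='%d')

print(data.shape)
```

Note that genfromtxt returns floats by default; the fmt='%d' argument only affects how the values are written out.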


To better understand where your TypeError comes from, wrap your code in the following try/except block:

import traceback

try:
    pass  # your code goes here
except Exception:
    # Print the full traceback, including the line that raised.
    traceback.print_exc()
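
For this particular error, a likely culprit (an inference from the posted code, not something a traceback in the question confirms) is output_dir = os.mkdir('sub_dir'): os.mkdir creates the directory and returns None, so os.chdir(output_dir) is called with None. A small sketch in a temporary directory:

```python
import os
import tempfile

base = tempfile.mkdtemp()            # scratch location for the demo
sub_dir = os.path.join(base, 'sub_dir')

result = os.mkdir(sub_dir)           # creates the directory, returns None
print(result)

try:
    os.chdir(result)                 # same as os.chdir(None)
except TypeError as exc:
    print('TypeError: %s' % exc)
    caught = True
else:
    caught = False

# Keep the path string yourself instead of the return value of os.mkdir.
output_dir = sub_dir
```

The exact wording of the TypeError message differs between Python versions.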

Upvotes: 0

Brent Writes Code

Reputation: 19623

Instead of item.replace, check out some of the functions in the os.path module (http://docs.python.org/library/os.path.html). They're made for splitting up and recombining parts of filenames. For instance, os.path.splitext will split a filename into a file path and a file extension.

Let's say you have a file /tmp/foo.rtf and you want to move it to /tmp/foo.txt:

import os

old_file = '/tmp/foo.rtf'
# splitext returns (path without extension, extension including the dot)
(root, ext) = os.path.splitext(old_file)
print('File=%s Extension=%s' % (root, ext))
new_file = '%s%s' % (root, '.txt')
print('New file = %s' % new_file)

Or if you want the one line version:

old_file = '/tmp/foo.rtf'
new_file = '%s%s' % (os.path.splitext(old_file)[0],'.txt')
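
Applied to the loop in the question, here is a rough sketch of building the output path inside a subdirectory without calling os.chdir. The helper name output_path_for is made up for illustration, and the directory layout simply mirrors the question.

```python
import os

def output_path_for(input_path, output_dir, new_ext='.txt'):
    """Return the path output_dir/<input file name with new_ext>."""
    filename = os.path.basename(input_path)         # e.g. 'foo.rtf'
    root, _old_ext = os.path.splitext(filename)     # e.g. ('foo', '.rtf')
    return os.path.join(output_dir, root + new_ext)

# Hypothetical layout, mirroring the question.
dir_path = '/Users/me/Desktop/test'
output_dir = os.path.join(dir_path, 'sub_dir')
new_file = output_path_for(os.path.join(dir_path, 'foo.rtf'), output_dir)
print(new_file)
```

In the loop, create the directory once before iterating (os.mkdir returns None, so keep the path string in your own variable) and open output_path_for(item, output_dir) for writing.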

Upvotes: 3
