user8698371

Accept multiple files in parameter using args python

I need a function that can open and process multiple text files passed as arguments. I figured using *args in the function's parameter list would work, but I get an error about tuples and strings.

def open_file(*filename): 
   file = open(filename,'r')
   text = file.read().strip(punctuation).lower()  
   print(text)

open_file('Strawson.txt','BigData.txt')
ERROR: expected str, bytes or os.PathLike object, not tuple 

How do I do this the right way?

Upvotes: 1

Views: 3175

Answers (1)

PM 2Ring

Reputation: 55479

When you use the *args syntax in a function's parameter list, you can call the function with any number of positional arguments, and they arrive inside the function as a single tuple. That is why open(filename) fails in your code: filename is the whole tuple, not one file name. To process each argument you need to loop over the tuple, like this:

from string import punctuation

# Make a translation table to delete punctuation
no_punct = dict.fromkeys(map(ord, punctuation))

def open_file(*filenames):
    for filename in filenames:
        print('FILE', filename)
        with open(filename) as file:
            text = file.read()
        text = text.translate(no_punct).lower()
        print(text)
        print()

# Test

open_file('Strawson.txt', 'BigData.txt')

I've also included a dictionary, no_punct, that str.translate uses to remove all punctuation from the text (strip(punctuation) only removes punctuation from the two ends of the string). And I've used a with statement so each file gets closed automatically.
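
To see both pieces in isolation, here is a minimal sketch that needs no files. The helper name show_args and the sample sentence are made up for illustration; only punctuation, dict.fromkeys and str.translate come from the code above.

```python
from string import punctuation

def show_args(*filenames):
    # *filenames packs every positional argument into one tuple
    return filenames

received = show_args('Strawson.txt', 'BigData.txt')
print(type(received).__name__, received)

# dict.fromkeys(map(ord, punctuation)) maps each punctuation code point to None;
# str.translate deletes every character that is mapped to None
no_punct = dict.fromkeys(map(ord, punctuation))
cleaned = "Hello, World! It's 9:30.".translate(no_punct).lower()
print(cleaned)
```

Because received is a tuple, passing it straight to open() raises the TypeError from the question, while looping over it hands open() one string at a time.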


If you want the function to "return" the processed contents of each file, you can't just put return inside the loop, because return exits the function during the first iteration. You could save the file contents into a list and return that after the loop finishes. But a better option is to turn the function into a generator. The Python yield keyword makes that simple. Here's an example to get you started.

def open_file(*filenames):
    for filename in filenames:
        print('FILE', filename)
        with open(filename) as file:
            text = file.read()
        text = text.translate(no_punct).lower()
        yield text

def create_tokens(*filenames):
    tokens = [] 
    for text in open_file(*filenames):
        tokens.append(text.split())
    return tokens

files = '1.txt','2.txt','3.txt'
tokens = create_tokens(*files)
print(tokens)
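
If it helps to see what the generator does step by step, here is a small self-contained sketch. It writes two throwaway files into a temporary directory (the file contents are invented for the demo) and then pulls results from open_file one at a time.

```python
import os
import tempfile
from string import punctuation

no_punct = dict.fromkeys(map(ord, punctuation))

def open_file(*filenames):
    for filename in filenames:
        with open(filename) as file:
            text = file.read()
        yield text.translate(no_punct).lower()

with tempfile.TemporaryDirectory() as tmp:
    # Create two small demo files (contents are made up)
    paths = []
    for name, content in [('1.txt', 'First, file!'), ('2.txt', 'Second file.')]:
        path = os.path.join(tmp, name)
        with open(path, 'w') as f:
            f.write(content)
        paths.append(path)

    gen = open_file(*paths)  # calling the function reads nothing yet
    first = next(gen)        # runs the loop body once, for the first file only
    rest = list(gen)         # runs the remaining iterations

print(first)
print(rest)
```

Each file is opened only when the caller asks for the next item, so a long list of files never has to sit in memory all at once.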

Note that I removed the word.strip(punctuation).lower() call from create_tokens: it's not needed because we're already removing all punctuation and folding the text to lower-case inside open_file.

We don't really need two functions here. We can combine everything into one:

def create_tokens(*filenames):
    for filename in filenames:
        #print('FILE', filename)
        with open(filename) as file:
            text = file.read()
        text = text.translate(no_punct).lower()
        yield text.split()

tokens = list(create_tokens('1.txt','2.txt','3.txt'))
print(tokens)
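
For comparison, the list-based alternative mentioned earlier might look like the sketch below. The name read_files and the demo file contents are placeholders of mine; the processing steps are the same as in the generator versions.

```python
import os
import tempfile
from string import punctuation

no_punct = dict.fromkeys(map(ord, punctuation))

def read_files(*filenames):
    texts = []
    for filename in filenames:
        with open(filename) as file:
            text = file.read()
        texts.append(text.translate(no_punct).lower())
    # return sits after the loop, so every file is processed first
    return texts

with tempfile.TemporaryDirectory() as tmp:
    # Create two small demo files (contents are made up)
    paths = []
    for name, content in [('a.txt', 'Hello, World!'), ('b.txt', 'Big Data?')]:
        path = os.path.join(tmp, name)
        with open(path, 'w') as f:
            f.write(content)
        paths.append(path)
    result = read_files(*paths)

print(result)
```

This is fine for a handful of small files. The generator is the better fit when the files are large or when you only need to look at one at a time.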

Upvotes: 2
