Saniii

Reputation: 97

Concatenating files of different directories to one file (Python)

I have managed to concatenate every .txt file in one directory into a single file with this code:

import os
import glob

folder_path = "/Users/EnronSpam/enron1/ham"
for filename in glob.glob(os.path.join(folder_path, '*.txt')):
    with open(filename, 'r', encoding="latin-1") as f:
        text = f.read()
        with open('new.txt', 'a') as a:
            a.write(text)

However, my 'EnronSpam' folder actually contains multiple directories (enron1 to enron6), each of which has its own ham directory. How can I go through every one of these directories and add all of their files to a single output file?

Upvotes: 0

Views: 628

Answers (2)

Szabolcs

Reputation: 4106

If you just want to collect all the .txt files from the enron[1-6]/ham folders, try this:

glob.glob("/Users/EnronSpam/enron[1-6]/ham/*.txt")

The [1-6] part of the pattern matches a single character from 1 to 6, so this picks up every .txt file in the ham subfolder of enron1 through enron6.
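
To see which folders such a pattern selects, here is a minimal sketch that runs it against a throwaway directory tree. The temporary tree, the extra enron7 folder, the spam folders and the file name msg.txt are all made up for the demonstration and are not part of the real dataset layout.

```python
import glob
import os
import tempfile

# Build a throwaway tree: enron1..enron7, each with ham/ and spam/ folders.
# (All names here are illustrative stand-ins for the real dataset.)
root = tempfile.mkdtemp()
for i in range(1, 8):
    for kind in ("ham", "spam"):
        folder = os.path.join(root, f"enron{i}", kind)
        os.makedirs(folder)
        with open(os.path.join(folder, "msg.txt"), "w") as f:
            f.write(f"{kind} {i}\n")

# [1-6] matches exactly one character in the range 1 to 6,
# so enron7 and every spam folder are skipped.
# glob.escape protects the temp path in case it contains glob characters.
pattern = os.path.join(glob.escape(root), "enron[1-6]", "ham", "*.txt")
matches = sorted(glob.glob(pattern))

# Recover the enronN folder name from each matched path.
matched_dirs = [os.path.basename(os.path.dirname(os.path.dirname(m))) for m in matches]
print(matched_dirs)
```

Note that a character range only covers one character, so a folder such as enron10 would need a different pattern.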

A slightly reworked version of the original code looks like this:

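import glob

# [1-6] matches enron1 through enron6
glob_path = "/Users/EnronSpam/enron[1-6]/ham/*.txt"
# Open the output once; "w" starts from an empty file on every run
with open("new.txt", "w") as a:
    for filename in glob.glob(glob_path):
        with open(filename, "r", encoding="latin-1") as f:
            a.write(f.read())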

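Instead of reopening the new file in append mode for every input file, it makes more sense to open it once at the beginning and write the content of each ham .txt file into it. Because "w" truncates the file, running the script again replaces the previous output rather than appending a second copy to it.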
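
The same idea can be wrapped in a small function. This is only a sketch: the function name concatenate, the sorted() call (glob.glob does not guarantee any order) and the explicit encoding on the output file are my own additions, and the demo runs on a temporary tree rather than the real dataset.

```python
import glob
import os
import tempfile


def concatenate(pattern, out_path, encoding="latin-1"):
    """Write every file matching `pattern` into `out_path`; return the file count."""
    # sorted() makes the order reproducible; glob.glob gives no ordering guarantee.
    filenames = sorted(glob.glob(pattern))
    # "w" truncates the output, so rerunning does not duplicate earlier content.
    with open(out_path, "w", encoding=encoding) as out:
        for filename in filenames:
            with open(filename, "r", encoding=encoding) as f:
                out.write(f.read())
    return len(filenames)


# Demo on a temporary tree standing in for the real dataset.
root = tempfile.mkdtemp()
for i in (1, 2):
    folder = os.path.join(root, f"enron{i}", "ham")
    os.makedirs(folder)
    with open(os.path.join(folder, "a.txt"), "w", encoding="latin-1") as f:
        f.write(f"mail {i}\n")

out_path = os.path.join(root, "new.txt")
pattern = os.path.join(glob.escape(root), "enron[1-6]", "ham", "*.txt")
concatenate(pattern, out_path)
count = concatenate(pattern, out_path)  # second run overwrites instead of appending
with open(out_path, encoding="latin-1") as f:
    combined = f.read()
print(count, repr(combined))
```

Writing the output with the same encoding used for reading avoids depending on the platform's default encoding.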

Upvotes: 1

VVelev

Reputation: 11

Since the number and the names of the directories are known, you can put the full paths in a list and run the same code for each element:

import os
import glob

# One entry per ham directory, enron1 through enron6
folder_list = [
    "/Users/EnronSpam/enron1/ham",
    "/Users/EnronSpam/enron2/ham",
    "/Users/EnronSpam/enron3/ham",
    "/Users/EnronSpam/enron4/ham",
    "/Users/EnronSpam/enron5/ham",
    "/Users/EnronSpam/enron6/ham",
]
for folder in folder_list:
    for filename in glob.glob(os.path.join(folder, '*.txt')):
        with open(filename, 'r', encoding="latin-1") as f:
            text = f.read()
            with open('new.txt', 'a') as a:
                a.write(text)
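
Writing the paths out by hand works for a handful of folders. As a possible variation, the list could also be built from the numbering scheme. The helper name ham_folders and its first/last parameters are hypothetical, and the demo uses a temporary tree in which only some of the folders exist.

```python
import os
import tempfile


def ham_folders(root, first=1, last=6):
    """Return the existing enronN/ham folders under `root` for N in first..last."""
    candidates = [os.path.join(root, f"enron{n}", "ham") for n in range(first, last + 1)]
    # Keep only folders that are actually present, so a missing one is skipped.
    return [folder for folder in candidates if os.path.isdir(folder)]


# Demo: only enron1, enron2 and enron4 exist in this temporary tree.
root = tempfile.mkdtemp()
for n in (1, 2, 4):
    os.makedirs(os.path.join(root, f"enron{n}", "ham"))

folder_list = ham_folders(root)
print([os.path.relpath(folder, root) for folder in folder_list])
```

The resulting folder_list can be passed to the same loop as the hand-written list.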

Upvotes: 0
