Hannah Baker

Reputation: 31

opening and reading all the files in a directory in python - python beginner

I'd like to read the contents of every file in a folder/directory and then print them at the end (I eventually want to pick out bits and pieces from the individual files and put them in a separate document). So far I have this code:

import os
path = 'results/'
fileList = os.listdir(path)
for i in fileList:
    file = open(os.path.join('results/'+ i), 'r')
allLines = file.readlines()
print(allLines)

At the end I don't get any errors, but it only prints the contents of the last file in my folder as a list of strings. I want to make sure it's reading every file so I can then access the data I want from each file. I've looked online and I can't find where I'm going wrong. Is there any way of making sure the loop is iterating over all my files and reading all of them?

I also get the same result when I use

 file = open(os.path.join('results/',i), 'r')

in the 5th line

Please help I'm so lost Thanks!!

Upvotes: 3

Views: 30147

Answers (4)

Maarten Fabré

Reputation: 7058

  • Separate the different steps of what you want to do into their own functions.
  • Use generators wherever possible, especially if there are a lot of files or large files.

Imports

from pathlib import Path
import sys

Deciding which files to process:

source_dir = Path('results/')

files = source_dir.iterdir()

[Optional] Filter files

For example, if you only need files with extension .ext

files = source_dir.glob('*.ext')

Process files

def process_files(files):
    for file in files:
        with file.open('r') as file_handle :
            for line in file_handle:
                # do your thing
                yield line

Save the lines you want to keep

def save_lines(lines, output_file=sys.stdout):
    for line in lines:
        output_file.write(line)
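
The pieces above could be wired together roughly like this. This is only a sketch: the `keep` filter, the `'ERROR'` marker and the `main` wrapper are hypothetical names added for illustration, and it assumes every entry in the folder is a text file or a subdirectory to skip.

```python
import sys
from pathlib import Path


def iter_lines(files):
    """Yield every line of every path in `files`, one file at a time."""
    for file in files:
        if not file.is_file():  # iterdir() also returns subdirectories
            continue
        with file.open('r') as file_handle:
            yield from file_handle


def keep(lines, marker):
    """Hypothetical filter: pass through only the lines containing `marker`."""
    return (line for line in lines if marker in line)


def save_lines(lines, output_file=sys.stdout):
    """Write each line to `output_file` (standard output by default)."""
    for line in lines:
        output_file.write(line)


def main(source_dir='results/', marker='ERROR'):
    # sorted() only makes the order of the files predictable
    files = sorted(Path(source_dir).iterdir())
    save_lines(keep(iter_lines(files), marker))
```

Because each step is a generator, only one line is held in memory at a time. Calling `main()` would print the matching lines from every file in `results/`; passing an open file as `output_file` to `save_lines` would write them to a separate document instead.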

Upvotes: 6

Smich

Reputation: 435

You forgot to indent the line allLines = file.readlines(). Because it was outside the loop, it only ran once, after the for loop had finished, so it only read the last file that the file variable was left pointing to. Also, there is no need for readlines() here. Create a list before the loop and append the result of read() for each file instead:

import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for name in fileList:
    # open each file in turn; the with block closes it again afterwards
    with open(os.path.join(path, name), 'r') as file:
        allLines.append(file.read())
print(allLines)
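
If you also want to be sure that every file was actually read, one option is to key the contents by file name rather than collecting them in a plain list. A minimal sketch, assuming the folder holds text files; `read_all` and `report` are hypothetical helper names, not part of the question:

```python
import os


def read_all(path):
    """Return {filename: contents} for each regular file directly in `path`."""
    contents = {}
    for name in sorted(os.listdir(path)):
        full_path = os.path.join(path, name)
        if not os.path.isfile(full_path):  # os.listdir also returns subdirectories
            continue
        with open(full_path, 'r') as handle:
            contents[name] = handle.read()
    return contents


def report(contents):
    """Print one line per file so it is easy to see that none was skipped."""
    for name, text in contents.items():
        print(f'{name}: {len(text.splitlines())} lines')
```

With this, `report(read_all('results/'))` lists each file with its line count, and `read_all('results/')[name]` gives you the text of one particular file when you later want to pick bits out of it.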

Upvotes: 1

Nicola Amadio

Reputation: 159

This also creates a file containing the contents of all the files you wanted to print.

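import os

rootdir = 'C:\\Users\\you\\folder\\'  # your folder
with open('final_file.txt', 'a') as f:
    for root, dirs, files in os.walk(rootdir):
        for filename in files:
            # os.walk yields bare file names, so join each with its directory
            full_name = os.path.join(root, filename)
            with open(full_name) as source:
                f.write(source.read() + "\n")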

This is a similar case, with more features: Copying selected lines from files in different directories to another file

Upvotes: 0

Rayane Bouslimi

Reputation: 193

You forgot to indent the line allLines = file.readlines(). Maybe you can try this:

import os

allLines = []
path = 'results/'
fileList = os.listdir(path)
for name in fileList:
    # join the folder and the file name, then read the whole file
    with open(os.path.join(path, name), 'r') as file:
        allLines.append(file.read())
print(allLines)

Upvotes: 1
