kristof

Reputation: 11

Reading data from files in a folder

How do I read the data from multiple txt files placed in a specific folder? I started with the code below but could not fix it. It gives an error like "No such file or directory: '.idea'". Say I have a folder A containing x.txt, y.txt, z.txt and so on; I am trying to read and print the contents of all of those files.

def find_get(folder):
    for file in os.listdir(folder):
        f = open(file, 'r')
        for data in open(file, 'r'):
            print data

find_get('filex')

Thanks.

Upvotes: 1

Views: 504

Answers (5)

Padraic Cunningham

Reputation: 180411

If you just want to print each line:

import glob
import os

def find_get(path):
    for f in glob.glob(os.path.join(path,"*.txt")):
        with open(f) as data:  # glob already returns the path joined to the filename
            for line in data:
                print(line)

glob will find only your .txt files in the specified path.

Your error comes from not joining the path to the filename. Unless the file is in the directory you run the code from, Python cannot find it without the path. Another issue is that you seem to have a directory named .idea, which would also raise an error when you try to open it as a file. This also presumes you have permission to read the files in the directory.

If your files were larger, I would avoid reading everything into memory or storing the full content.
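As a rough sketch of that last point, a generator can hand back one line at a time so that no file is ever held in memory in full. The name iter_lines and the pattern parameter are made up for illustration, and sorted() is only there to make the order predictable:

```python
import glob
import os

def iter_lines(path, pattern="*.txt"):
    """Yield (filename, line) pairs lazily, one line at a time."""
    # glob returns matches in arbitrary order; sorting is an optional nicety.
    for pathname in sorted(glob.glob(os.path.join(path, pattern))):
        with open(pathname) as data:
            for line in data:
                # Strip the trailing newline so callers get clean text.
                yield os.path.basename(pathname), line.rstrip("\n")
```

Looping with "for name, line in iter_lines('filex'):" then processes the lines as they are read.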

Upvotes: 2

abarnert

Reputation: 365717

The error you're facing is simple: listdir returns filenames, not full pathnames. To turn them into pathnames you can access from your current working directory, you have to join them to the directory path:

for filename in os.listdir(directory):
    pathname = os.path.join(directory, filename)
    with open(pathname) as f:
        pass  # do stuff with f here

So, in your case, there's a file named .idea in the folder directory, but you're trying to open a file named .idea in the current working directory, and there is no such file.

There are at least four other potential problems with your code that you also need to think about and possibly fix after this one:

  • You don't handle errors. There are many very common reasons you may not be able to open and read a file--it may be a directory, you may not have read access, it may be exclusively locked, it may have been moved since your listdir, etc. And those aren't logic errors in your code or user errors in specifying the wrong directory, they're part of the normal flow of events, so your code should handle them, not just die. Which means you need a try statement.
  • You don't do anything with the files but print out every line. Basically, this is like running cat folder/* from the shell. Is that what you want? If not, you have to figure out what you want and write the corresponding code.
  • You open the same file twice in a row, without closing in between. At best this is wasteful, at worst it will mean your code doesn't run on any system where opens are exclusive by default. (Are there such systems? Unless you know the answer to that is "no", you should assume there are.)
  • You don't close your files. Sure, the garbage collector will get to them eventually--and if you're using CPython and know how it works, you can even prove the maximum number of open file handles that your code can accumulate is fixed and pretty small. But why rely on that? Just use a with statement, or call close.

However, none of those problems are related to your current error. So, while you have to fix them too, don't expect fixing one of them to make the first problem go away.
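A minimal sketch of the try and with advice from the list above, assuming you want to collect the contents and keep going when an entry cannot be read. The name read_folder and the (contents, errors) return shape are illustrative choices, not something from the question:

```python
import os

def read_folder(directory):
    """Return (contents, errors), both keyed by filename."""
    contents, errors = {}, {}
    for filename in os.listdir(directory):
        pathname = os.path.join(directory, filename)
        try:
            # Open each file once; the with block closes it for us.
            with open(pathname) as f:
                contents[filename] = f.read()
        except (IOError, OSError) as exc:
            # Directories, permission problems, files that vanished, etc.
            errors[filename] = exc
    return contents, errors
```

What to do with the errors (log them, re-raise, ignore) depends on your program; the point is only that an unreadable entry such as .idea no longer kills the loop.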

Upvotes: 1

Reut Sharabani

Reputation: 31339

First of all, make sure you add the folder name to the file name, so the file can be found relative to where the script is executed.

To do so, use os.path.join, which, as its name suggests, joins paths. So, using a generator:

import os

def find_get(folder):
    for filename in os.listdir(folder):
        relative_file_path = os.path.join(folder, filename)
        with open(relative_file_path) as f:
            # read() gives the entire data from the file
            yield f.read()

# this consumes the generator to a list
files_data = list(find_get('filex'))

See what we got in the list that consumed the generator:

print(files_data)

It may be more convenient to produce tuples which can be used to construct a dict:

def find_get(folder):
    for filename in os.listdir(folder):
        relative_file_path = os.path.join(folder, filename)
        with open(relative_file_path) as f:
            # read() gives the entire data from the file
            yield (relative_file_path, f.read(), )

# this consumes the generator into a dict
files_data = dict(find_get('filex'))

You will now have a mapping from each file's relative path to its content.

Also, take a look at the answer by @Padraic Cunningham . He brought up the glob module which is suitable in this case.

Upvotes: 1

Reishin

Reputation: 1954

Full variant:

import os

def find_get(path):
  files = {}
  for file in os.listdir(path):
    if os.path.isfile(os.path.join(path,file)):
      with open(os.path.join(path,file), "r") as data:
        files[file] = data.read()
  return files

print(find_get("filex"))

Output:

{'1.txt': 'dsad', '2.txt': 'fsdfs'}

After that, you could generate one file from that content, etc.

Key things:

  • os.listdir returns a list of names without the full path, so you need to join the initial path with each found item before operating on it.
  • a dict is a natural fit for the result :)
  • os.listdir returns both files and folders, so you need to check that each item really is a file
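As a sketch of the "generate one file from that content" idea, the same checks can feed a single output file. The name combine and the out_path parameter are hypothetical, and the sketch assumes out_path lies outside the folder being read:

```python
import os

def combine(path, out_path):
    """Concatenate every regular file in path into out_path."""
    with open(out_path, "w") as out:
        # sorted() gives a stable order; os.listdir makes no ordering promise.
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            if not os.path.isfile(full):
                continue  # skip folders
            with open(full, "r") as data:
                out.write(data.read())
```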

Upvotes: 0

bosnjak

Reputation: 8614

You should check that each entry is actually a file and not a folder, since you can't open folders for reading. Also, you can't open the bare filename, since the file lives under a folder, so build the correct path with os.path.join. See below:

import os

def find_get(folder):
    for file in os.listdir(folder):
        path = os.path.join(folder, file)
        if not os.path.isfile(path):  # test the joined path, not the bare name
            continue  # skip directories
        with open(path, 'r') as f:
            for line in f:
                print(line)

Upvotes: -1
