Hossein

Reputation: 41831

Problem reading text files without extensions in Python

I have written a piece of code that is supposed to read the text inside several files located in a directory. These files are basically text files, but they do not have any extensions. My code is not able to read them:

import glob
import os

corpus_path = 'Reviews/'

for infile in glob.glob(os.path.join(corpus_path,'*.*')):
    review_file = open(infile,'r').read()
    print review_file

To test whether this code works, I added a dummy text file, dummy.txt, which was read because it has an extension. But I don't know what should be done so that files without extensions can be read. Can someone help me? Thanks

Upvotes: 3

Views: 4584

Answers (4)

eyquem

Reputation: 27575

It seems that you need

from os import listdir

# listdir returns bare names; keep only those without a dot
for filename in (fn for fn in listdir(corpus_path) if '.' not in fn):
    # do something

You could also write

from os import listdir

for fn in listdir(corpus_path):
    if '.' not in fn:
        # do something

but the former, with a generator expression, saves one level of indentation.

Upvotes: 0

mikej

Reputation: 66263

Glob patterns don't work the same way as wildcards on the Windows platform. Just use * instead of *.*, i.e. os.path.join(corpus_path,'*'). Note that * will match every file in the directory; if that's not what you want, you can revise the pattern accordingly.

See the glob module documentation for more details.
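As a rough sketch of the suggestion above, the loop from the question could look like this with the '*' pattern. It is written in Python 3 syntax (the question uses Python 2), and the helper name read_all_files and the throwaway demo directory are made up for illustration. Since '*' also matches subdirectories, the sketch skips anything that is not a regular file.

```python
import glob
import os
import tempfile


def read_all_files(corpus_path):
    """Return {filename: text} for every regular file in corpus_path."""
    texts = {}
    for path in glob.glob(os.path.join(corpus_path, '*')):
        # '*' matches directories too, so keep regular files only
        if os.path.isfile(path):
            with open(path, 'r') as handle:
                texts[os.path.basename(path)] = handle.read()
    return texts


# Demo in a temporary directory (hypothetical file names)
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ('review1', 'dummy.txt'):
        with open(os.path.join(demo_dir, name), 'w') as handle:
            handle.write('text of ' + name)
    demo_texts = read_all_files(demo_dir)
    print(sorted(demo_texts))
```

Both the file with an extension and the one without are picked up. Names starting with a dot are a separate case: by default glob does not match them with '*'.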

Upvotes: 6

Simone

Reputation: 11797

You could search for * instead of *.*, but this will match every file in your directory.

Fundamentally, this means that you will have to handle cases where the file you are opening is not a text file.
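One possible way to handle that case, sketched in Python 3 and assuming the reviews are UTF-8 encoded (the question does not say), is to treat a decoding failure as "not a text file" and skip it. The helper name read_text_or_none and the demo file names are hypothetical. A successful decode is only a heuristic, since some binary data happens to be valid UTF-8.

```python
import os
import tempfile


def read_text_or_none(path, encoding='utf-8'):
    """Return the file's text, or None if it does not decode as `encoding`."""
    try:
        with open(path, 'r', encoding=encoding) as handle:
            return handle.read()
    except UnicodeDecodeError:
        return None


# Demo: one text file and one file holding bytes that are not valid UTF-8
with tempfile.TemporaryDirectory() as demo_dir:
    text_path = os.path.join(demo_dir, 'review1')
    binary_path = os.path.join(demo_dir, 'thumbnail')
    with open(text_path, 'w', encoding='utf-8') as handle:
        handle.write('a plain review')
    with open(binary_path, 'wb') as handle:
        handle.write(b'\xff\xd8\xff\xe0')
    text_result = read_text_or_none(text_path)
    binary_result = read_text_or_none(binary_path)
```

In the loop from the question, a None result would simply mean "skip this file".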

Upvotes: 3

Tim Pietzcker

Reputation: 336148

Just use * instead of *.*.

The latter requires an extension to be present (more precisely, there needs to be a dot in the filename), the former doesn't.
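To see the difference without touching the file system, here is a small sketch using the standard fnmatch module, which applies the same shell-style wildcard rules that glob uses for matching names. The file names in the list are made up for the demonstration.

```python
import fnmatch

# Hypothetical directory listing
names = ['review1', 'dummy.txt', 'notes.']

# '*.*' needs a literal dot somewhere in the name
with_dot = fnmatch.filter(names, '*.*')

# '*' matches any name, with or without a dot
everything = fnmatch.filter(names, '*')

print(with_dot)
print(everything)
```

Here 'review1' is missing from the first result because it contains no dot, which is exactly what happens to the extensionless review files in the question.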

Upvotes: 5
