Pablo

Reputation: 45

Reading HTML files located in different folders

I want to read HTML files in Python. Normally I do it like this (and it works):

import codecs
f = codecs.open("test.html", 'r')
print(f.read())

The problem is that my HTML files are not all in the same folder: I have a program that generates these HTML files and saves them into folders inside the folder where my script lives. In short, my script is in one folder, and inside that folder there are more folders containing the generated HTML files.

Does anybody know how I can proceed?

Upvotes: 0

Views: 84

Answers (2)

Saket Mittal

Reputation: 3906

Use os.walk:

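import os
import codecs

# Walk "/mydir" and every folder below it.
for root, dirs, files in os.walk("/mydir"):
    for name in files:
        if name.endswith(".html"):
            # Join the folder and the file name to get the full path.
            with codecs.open(os.path.join(root, name), 'r') as f:
                print(f.read())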

Upvotes: 0

Torxed

Reputation: 23500

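import os
import codecs

# Walk the current working directory and every folder below it.
for root, dirs, files in os.walk("./"):
    for name in files:
        # os.path.join builds the path portably; normpath drops the leading "./".
        path = os.path.normpath(os.path.join(root, name))
        file_name, file_ext = os.path.splitext(path)
        if file_ext == '.html':
            with codecs.open(path, 'r') as f:
                print(f.read())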

This walks through ./ (the current working directory, which is the script's directory only if you start the script from there) and loops over all files in every sub-directory. It checks whether the extension is .html and does the work on each .html file.
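
If the script might be started from some other directory, one option is to build the starting path from the script's own location instead of relying on ./. A minimal sketch, where script_directory and iter_html_paths are hypothetical helper names and not part of the answer above:

```python
import os


def script_directory(script_path):
    """Return the absolute directory that contains script_path."""
    return os.path.dirname(os.path.abspath(script_path))


def iter_html_paths(base_dir):
    """Yield the path of every .html file in base_dir and its sub-directories."""
    for root, dirs, files in os.walk(base_dir):
        for name in files:
            # splitext returns (stem, extension); compare the extension only.
            if os.path.splitext(name)[1] == ".html":
                yield os.path.join(root, name)
```

Inside a script this could be called as iter_html_paths(script_directory(__file__)), so the walk starts next to the script regardless of where it is launched from.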

You may want to accept more file extensions (for instance .htm).
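
One way to accept several extensions is to pass a tuple to str.endswith. This is only a sketch: HTML_EXTENSIONS, is_html_file and find_html_files are made-up names, and the case-insensitive match is an assumption about what you want.

```python
import os

# Assumed set of accepted extensions; adjust as needed.
HTML_EXTENSIONS = (".html", ".htm")


def is_html_file(name, extensions=HTML_EXTENSIONS):
    """Return True if name ends with one of the extensions, ignoring case."""
    # str.endswith accepts a tuple and matches any of its entries.
    return name.lower().endswith(extensions)


def find_html_files(base_dir, extensions=HTML_EXTENSIONS):
    """Return a sorted list of matching file paths below base_dir."""
    matches = []
    for root, dirs, files in os.walk(base_dir):
        for name in files:
            if is_html_file(name, extensions):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Each returned path can then be opened and read exactly as in the loop above.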

Upvotes: 1
