Reputation: 237
I am still new to Python and I am making a simple application that extracts text from ppt files.
I have this project structure:
> Project_Python
>> Files
>>> Class A
- History.ppt
>>> Class B
- Animals.ppt
>> Result
???
- main.py
My question is: how can I read the files inside the sub-folders Class A and Class B?
I also want the script to automatically recreate the folder structure of Files inside Result after printing.
This is what I've tried:
from pptx import Presentation
import glob
import pathlib
import os

p_temp = pathlib.Path('Files')  # How can I read the sub-folders dynamically?

for eachfile in glob.glob("**/*.pptx"):
    prs = Presentation(eachfile)
    print(eachfile)
    print("----------------------")
    textdata = []
    for slide in prs.slides:
        for shape in slide.shapes:
            if hasattr(shape, "text"):
                textdata.append(shape.text)
    # Should create the same folder structure as Files
    print(''.join(textdata[1:]), file=open("Result/" + eachfile + ".txt", "a"))
Upvotes: 0
Views: 337
Reputation: 803
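Your code is almost correct, except for the call to glob.glob.
The ** pattern only matches nested sub-folders when you also pass recursive=True.
To create a directory together with any missing parent directories, use os.makedirs (with exist_ok=True, so that running the script a second time does not fail).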
from pptx import Presentation
import glob
import os
import pathlib

p_temp = pathlib.Path('Files')  # root folder that is searched recursively

# "**" only descends into sub-folders when recursive=True is passed
for eachfile in glob.glob(str(p_temp / "**" / "*.pptx"), recursive=True):
    prs = Presentation(eachfile)
    print(eachfile)
    print("----------------------")
    textdata = []
    for slide in prs.slides:
        for shape in slide.shapes:
            if hasattr(shape, "text"):
                textdata.append(shape.text)
    # Mirror the sub-folder structure of Files inside Result
    out_file = pathlib.Path('Result') / pathlib.Path(eachfile).relative_to(p_temp)
    os.makedirs(out_file.parent, exist_ok=True)
    with open(str(out_file) + ".txt", "a") as f:
        print(''.join(textdata[1:]), file=f)
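If you prefer to stay within pathlib, the folder-mirroring step can be sketched on its own, without python-pptx. This is only a rough sketch: the helper names mirrored_output_path and mirror_tree are made up for illustration, and it assumes you want History.pptx to end up as History.txt rather than History.pptx.txt.

```python
from pathlib import Path


def mirrored_output_path(src_file: Path, src_root: Path, dst_root: Path) -> Path:
    """Map src_root/<sub>/<name>.pptx to dst_root/<sub>/<name>.txt (hypothetical helper)."""
    # Path of the file relative to the source root, e.g. "Class A/History.pptx"
    relative = src_file.relative_to(src_root)
    # Re-anchor it under the destination root and swap the extension
    return (dst_root / relative).with_suffix(".txt")


def mirror_tree(src_root: Path, dst_root: Path, pattern: str = "*.pptx") -> list:
    """Create the mirrored folders and return the planned output paths (hypothetical helper)."""
    outputs = []
    # rglob searches src_root and all of its sub-folders
    for src_file in sorted(src_root.rglob(pattern)):
        out_file = mirrored_output_path(src_file, src_root, dst_root)
        # Create "Result/Class A" etc.; no error if the folder already exists
        out_file.parent.mkdir(parents=True, exist_ok=True)
        outputs.append(out_file)
    return outputs
```

Calling mirror_tree(Path("Files"), Path("Result")) from the project folder should create Result/Class A and Result/Class B, and return the .txt paths you can then open and write the extracted text to.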
Upvotes: 1