Reputation: 13
I am trying to import multiple files from different folders (in different directories) on my shared drive. But when I use the change directory function, I can only select one path.
Is there any way I can import files from multiple folders? The idea is to import 2-3 different files (.txt or .pdf) and merge them into one output file.
I have been using the following code so far:
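# Run once from a shell, not inside the script: pip install PyPDF2
from PyPDF2 import PdfFileMerger
import os

os.chdir("C:/Users/47124")  # os.chdir() returns None, so there is nothing to assign
merger = PdfFileMerger()
input1 = open("File1", "rb")  # what is rb and wb?
input2 = open("File2", "rb")
merger.append(fileobj=input1)
merger.append(fileobj=input2)
output = open("document-output.pdf", "wb")
merger.write(output)
output.close()
# Close the inputs only after the merged file has been written
input1.close()
input2.close()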
Note: File1 and File2 are in different locations; they cannot be placed in one folder.
Upvotes: 0
Views: 411
Reputation: 10709
There is no need to change directories within your Python code; just define the filenames as absolute (complete) paths. So instead of a relative path such as File1.pdf or MyFolder/File1.pdf, use an absolute one such as /home/ptklearning/Documents/MyFolder/File1.pdf or C:\Users\ptklearning\Documents\MyFolder\File1.pdf, or wherever the file is.
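To check whether a given path is relative or absolute before opening it, here is a minimal sketch using the standard pathlib module. The folder and file names are just the illustrative ones from above, not real files, and nothing is opened or read.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Illustrative paths only; none of these files needs to exist.
relative = PurePosixPath("MyFolder/File1.pdf")
absolute_posix = PurePosixPath("/home/ptklearning/Documents/MyFolder/File1.pdf")
absolute_windows = PureWindowsPath(r"C:\Users\ptklearning\Documents\MyFolder\File1.pdf")

print(relative.is_absolute())          # False
print(absolute_posix.is_absolute())    # True
print(absolute_windows.is_absolute())  # True

# A relative path is interpreted against the current working directory,
# which is why os.chdir() changes which file a relative name refers to.
anchored = Path.cwd() / "MyFolder" / "File1.pdf"
print(anchored.is_absolute())          # True
```

Because an absolute path does not depend on the working directory, files from any number of folders can be listed side by side without calling os.chdir() at all.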
merger = bytes()
filenames = [
    '/home/nponcian/Documents/GitHub/myproject/src/notes/file1.py',
    '/home/nponcian/Documents/Program/file2.txt',
    # Add more files as you wish. Note that they are appended as raw bytes.
]
filename_output = "document-output.txt"

for filename in filenames:
    with open(filename, 'rb') as file_input:
        file_content = file_input.read()
        merger += file_content

with open(filename_output, 'wb') as file_output:
    file_output.write(merger)
Contents of document-output.txt:
# This is file1.py
# A dummy line within file1.py
While this is file2.txt!
Nothing more from this 2nd file...
file2 wants to say goodbye now!
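If the input files are large, the same raw-byte merge can be done without holding everything in memory. The following is only a sketch of that variation: merge_files is a hypothetical helper name, and the two temporary folders stand in for the real, separate locations. As with the code above, this plain concatenation suits text-like files and is not a valid way to combine PDFs.

```python
import shutil
import tempfile
from pathlib import Path


def merge_files(input_paths, output_path):
    """Append the raw bytes of each input file to output_path, in order."""
    with open(output_path, "wb") as file_output:
        for input_path in input_paths:
            with open(input_path, "rb") as file_input:
                # Copies in chunks instead of reading the whole file at once.
                shutil.copyfileobj(file_input, file_output)


# Demo: two files living in two different (temporary) folders.
with tempfile.TemporaryDirectory() as folder_a, tempfile.TemporaryDirectory() as folder_b:
    first = Path(folder_a) / "file1.txt"
    second = Path(folder_b) / "file2.txt"
    first.write_bytes(b"from folder A\n")
    second.write_bytes(b"from folder B\n")

    merged = Path(folder_a) / "document-output.txt"
    merge_files([first, second], merged)
    merged_bytes = merged.read_bytes()

print(merged_bytes)  # b'from folder A\nfrom folder B\n'
```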
# Note: PyPDF2 3.x removed these names; there, use PdfMerger and PdfReader instead.
from PyPDF2 import PdfFileMerger, PdfFileReader

merger = PdfFileMerger()
filenames = [
    '/home/nponcian/Documents/Program/StackOverflow_how_to_python.pdf',
    '/media/sf_VirtualBoxFiles/python_cheat_sheet.pdf',
    # Add more files as you wish. Note that they are appended as PDF files.
]
filename_output = "document-output.pdf"

for filename in filenames:
    merger.append(PdfFileReader(filename, strict=False))

with open(filename_output, 'wb') as file_output:
    merger.write(file_output)
Contents of document-output.pdf:
<Combined PDF of the input files>
rb signifies Read-Binary
wb signifies Write-Binary (if you want to always overwrite the output file)
ab signifies Append-Binary (if you want to just append new merges to the output file)
You need the explicit b (binary) if you are processing files that are not the usual text files, such as docx, pdf, mp3, etc. Try opening them with a text editor and you will see what I mean :) Such files are read as Python bytes objects and not as Python str objects.
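The difference between these modes can be seen with a small throwaway file. This is just a sketch; demo.txt is a made-up name created in a temporary folder.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.txt")  # made-up file name

    with open(path, "wb") as f:  # wb: create the file, or empty it if it exists
        f.write(b"first\n")
    with open(path, "wb") as f:  # wb again: the previous content is overwritten
        f.write(b"second\n")
    with open(path, "ab") as f:  # ab: keep existing content and add to the end
        f.write(b"third\n")

    with open(path, "rb") as f:  # rb: read() returns a bytes object
        as_bytes = f.read()
    with open(path, "r", encoding="utf-8") as f:  # no b: read() returns a str
        as_text = f.read()

print(type(as_bytes).__name__, as_bytes)       # bytes b'second\nthird\n'
print(type(as_text).__name__, repr(as_text))   # str 'second\nthird\n'
```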
Upvotes: 0