PTKlearning

Reputation: 13

How to import files from different locations on a server using Python?

I am trying to import multiple files from different folders (in different directories) on my shared drive. But when I use the change directory function, I can only select one path.

Is there any way I can import files from multiple folders? The idea is to import 2-3 different files (.txt or .pdf) and merge them into one output file.

I have been using the following code so far:

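# Install first from a shell (not inside the script): pip install PyPDF2
from PyPDF2 import PdfFileMerger
import os
os.chdir("C:/Users/47124")  # os.chdir returns None, so there is nothing to assign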
merger = PdfFileMerger()
input1 = open("File1", "rb") #what is rb and wb?
input2 = open("File2", "rb")
merger.append(fileobj = input1)
merger.append(fileobj = input2)
output = open("document-output.pdf","wb")
merger.write(output)
output.close()

Note: File1 and File2 are in different locations; they cannot be placed in one folder.

Upvotes: 0

Views: 411

Answers (1)

Niel Godfrey P. Ponciano

Reputation: 10709

There is no need to change directories within your Python code. Just define the filenames as absolute (complete) paths. So instead of relative paths such as File1.pdf or MyFolder/File1.pdf, use absolute ones such as /home/ptklearning/Documents/MyFolder/File1.pdf or C:\Users\ptklearning\Documents\MyFolder\File1.pdf, or wherever each file actually is.
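If it is unclear whether a given path is relative or absolute, pathlib can tell you. The following is only a small sketch that reuses the example paths from the paragraph above; the pure path classes are used so that it behaves the same on any operating system, and the raw string (r"...") is there so the backslashes in the Windows path are not treated as escape sequences.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Example paths taken from the text above; they are placeholders, not real files.
relative = PurePosixPath("MyFolder/File1.pdf")
absolute_posix = PurePosixPath("/home/ptklearning/Documents/MyFolder/File1.pdf")
absolute_windows = PureWindowsPath(r"C:\Users\ptklearning\Documents\MyFolder\File1.pdf")

print(relative.is_absolute())          # False: resolved against the current directory
print(absolute_posix.is_absolute())    # True: starts at the filesystem root
print(absolute_windows.is_absolute())  # True: has both a drive letter and a root
print(absolute_windows.name)           # File1.pdf
```

An absolute path opens the same file no matter what the current working directory is, which is why os.chdir becomes unnecessary.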

Sample code:

merger = bytes()

filenames = [
    '/home/nponcian/Documents/GitHub/myproject/src/notes/file1.py',
    '/home/nponcian/Documents/Program/file2.txt',
    # Add more files as you wish. Note that they would be appended as raw bytes.
]

filename_output = "document-output.txt"

for filename in filenames:
    with open(filename, 'rb') as file_input:
        file_content = file_input.read()
        merger += file_content

with open(filename_output, 'wb') as file_output:
    file_output.write(merger)

Contents of document-output.txt:

# This is file1.py
# A dummy line within file1.py
While this is file2.txt!



Nothing more from this 2nd file...

file2 wants to say goodbye now!
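The loop above reads every input file fully into memory before writing. As a possible variation (a sketch, not part of the original answer), the same merge can be wrapped in a function that checks that all inputs exist and then streams them into the output. The name merge_files and its parameters are made up for this illustration.

```python
import shutil
from pathlib import Path


def merge_files(input_paths, output_path):
    """Concatenate the files in input_paths, in order, into output_path.

    Hypothetical helper for illustration; files are copied as raw bytes.
    """
    paths = [Path(p) for p in input_paths]

    # Fail before touching the output file if any input is missing.
    missing = [str(p) for p in paths if not p.is_file()]
    if missing:
        raise FileNotFoundError(f"Input files not found: {missing}")

    with open(output_path, "wb") as file_output:
        for path in paths:
            with open(path, "rb") as file_input:
                # Copies in chunks instead of loading the whole file at once.
                shutil.copyfileobj(file_input, file_output)
```

It would be called with the same kind of list as in the sample, for example merge_files(filenames, filename_output), where each entry can live in a different folder.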

Sample code (using PDF files):

# Written for PyPDF2 1.x/2.x. PyPDF2 3.0 removed these names in favor of PdfMerger and PdfReader.
from PyPDF2 import PdfFileMerger, PdfFileReader

merger = PdfFileMerger()

filenames = [
    '/home/nponcian/Documents/Program/StackOverflow_how_to_python.pdf',
    '/media/sf_VirtualBoxFiles/python_cheat_sheet.pdf',
    # Add more files as you wish. Note that they would be appended as PDF files.
]

filename_output = "document-output.pdf"

for filename in filenames:
    merger.append(PdfFileReader(filename, strict=False))

with open(filename_output, 'wb') as file_output:
    merger.write(file_output)

Contents of document-output.pdf:

<Combined PDF of the input files>

Note:

  • rb signifies Read-Binary
  • wb signifies Write-Binary (if you want to always overwrite the output file)
  • ab signifies Append-Binary (if you want to just append new merges to the output file)

You need the explicit b (binary) when the files you are processing are not plain text, for example docx, pdf, or mp3. Try opening one of them in a text editor and you will see what I mean :) In binary mode the contents are read as a Python bytes object rather than a str object.
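To make the modes in the note above concrete, here is a small self-contained sketch. It writes to a scratch file in a temporary directory (demo.bin is just a made-up name for this demo) and shows that wb starts the file from scratch, ab adds to the end, and rb returns bytes while plain text mode returns str.

```python
import os
import tempfile

# Scratch location used only for this demo.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.bin")

with open(demo_path, "wb") as f:  # wb: create or truncate, then write bytes
    f.write(b"first")

with open(demo_path, "ab") as f:  # ab: keep existing content, write at the end
    f.write(b"-second")

with open(demo_path, "rb") as f:  # rb: read the raw bytes
    as_bytes = f.read()

with open(demo_path, "r", encoding="utf-8") as f:  # text mode: decoded to str
    as_text = f.read()

print(type(as_bytes).__name__, as_bytes)  # bytes b'first-second'
print(type(as_text).__name__, as_text)    # str first-second
```

Text mode only works here because the content happens to be valid UTF-8; for a real pdf or mp3 the decode step would typically fail or produce garbage, which is why binary mode is used in the samples.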

Upvotes: 0
