Viont
Viont

Reputation: 43

Retrieving list of .xlsx files from Sharepoint folder with python

Good afternoon everybody!

As our company has several folders with quite a few .xlsx and .xlsb files on Sharepoint, I'm trying right now to write a script to automatically retrieve the list of these files from there and either save them locally or read them directly into pandas dataframes.

The folder structure is as follows:

https://company.sharepoint.com/sites/ProjectSite/Shared Documents/General/Year1

In total there are several Year-folders (Year1, Year2, Year3 etc.), where all .xlsx files are placed.

I tried following code (using the relative url to a folder):

ctx_auth = AuthenticationContext(FULL_URL)
ctx_auth.acquire_token_for_user(USERNAME, PASSWORD)
ctx = ClientContext(SITE_URL, ctx_auth)
lists = ctx.web.get_folder_by_server_relative_url("Shared Documents/General/2019/")
items = lists.folders

ctx.load(items)
ctx.execute_query()

for item in items:
    print(items)
    print("File name: {0}".format(item.properties["Name"]))

FULL_URL is the url with a full path to a Year folder: "https://company.sharepoint.com/sites/ProjectSite/Shared%20Documents/General/Year1/"

SITE_URL:

"https://company.sharepoint.com/sites/ProjectSite/"

But when I execute the code, nothing happens, just literally nothing. I assume, that I got smth wrong with the paths/urls, and most probably refer to false URL, but I don't really know, how to correct or which is the correct one.

Question: In how far the code above needs to be amended in order to list the files are in the folders Year1, Year2, Year3 and so on, so that I can read the list and either save or process them further?

Upvotes: 1

Views: 2718

Answers (1)

Viont
Viont

Reputation: 43

EDIT: It seems, I was blind enough to not see, that I'm trying to load folders instead of files, so after I amended the code to:

ctx_auth = AuthenticationContext(FULL_URL)
ctx_auth.acquire_token_for_user(USERNAME, PASSWORD)
ctx = ClientContext(SITE_URL, ctx_auth)
lists = ctx.web.get_folder_by_server_relative_url("Shared Documents/General/2019/")
items = lists.files

ctx.load(items)
ctx.execute_query()

for item in items:
    print(items)
    print("File name: {0}".format(item.properties["Name"]))

the files get printed.

Upvotes: 1

Related Questions