UPDATED: I made a bit of progress, but now I am stuck on an error due to a drive name being used twice with two different drive IDs. They are different folders. I don't have the authority to change any drive names, so I need to figure out what to fix in order to access this drive and its contents. See below for the update and the error:
ORIGINAL INTRO:
So, I can't really create reproducible code but I do believe nearly all SharePoint sites have a Site Contents page that looks something like this:
https://COMPANY.sharepoint.com/teams/TEAM-NAME/_layouts/15/viewlsts.aspx?view=14
I've been tasked with searching ALL content on this SharePoint site for specific keywords. I know how to go to https://COMPANY.sharepoint.com/teams/TEAM-NAME/ and call get_drive(), which returns a document library called Documents. It contains 9 subfolders (or subdrives). However, the site actually has a total of 87 document libraries. Maybe I just don't fully understand the structure of SharePoint, but I need access to all 87 document libraries listed in Site Contents.
Here's my code. I can confirm that there are 87 libraries, but after that point I can't get into them to search their contents, and this is where I need help. I am sure my issue is a fundamental lack of knowledge of SharePoint and Microsoft365R.
UPDATED CODE:
library(tidyverse)
library(Microsoft365R)
library(stringr)
library(HelpersMG)
library(AzureAuth)
library(httr)
library(pdftools)
library(readtext)
library(jsonlite)
### 1. Authenticate SharePoint site
site <- get_sharepoint_site(site_url="https://COMPANY.sharepoint.com/teams/TEAM-NAME/")
### 2. Get Site Information
# Get SharePoint site ID (needed for API calls)
site_id <- site$properties$id
# List all document libraries within the site
drives <- site$list_drives()
drv <- site$get_drive()
### 3. Use Microsoft Graph API to List All Document Libraries
# Define the Microsoft Graph API endpoint for document libraries
url <- paste0("https://graph.microsoft.com/v1.0/sites/", site_id, "/drives")
# Make API request
response <- GET(url, add_headers(Authorization = paste("Bearer", site$token$credentials$access_token)))
# Parse response
content <- content(response, as = "text", encoding = "UTF-8")
data <- fromJSON(content, flatten = TRUE)
libraries_df <- data$value
# Convert to data frame if it's not already
if (!is.data.frame(libraries_df)) {
  libraries_df <- as.data.frame(libraries_df, stringsAsFactors = FALSE)
}
# Confirms 87 document libraries
print(libraries_df[, c("id", "name")])
# Display drive names and IDs
for (drv in drives) {
  cat("Drive Name:", drv$properties$name, "\n")
  cat("Drive ID:", drv$properties$id, "\n\n")
}
### 4. Define a Function to Recursively List All Items in a Drive
list_files_recursive <- function(drive_id, parent_path = "") {
  # Retrieve the drive object using its ID
  drive <- site$get_drive(drive_id = drive_id)
  # Use list_items() to retrieve items in the specified folder
  items <- drive$list_items(parent_path)
  # Separate files and folders
  files <- items[!items$isdir, ]
  folders <- items[items$isdir, ]
  # Initialize a result with files
  all_files <- files
  # Recursively process each folder
  for (folder in folders$name) {
    folder_path <- file.path(parent_path, folder)
    all_files <- rbind(all_files, list_files_recursive(drive_id, folder_path))
  }
  return(all_files)
}
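A note on the path handling in the function above: on the first level of recursion parent_path is "", and base R's file.path("", "Reports") returns "/Reports", that is, a path with a leading slash. Whether that leading slash is what triggers the error shown further down is an assumption, not something confirmed against the Microsoft365R source. The sketch below only illustrates the base R behaviour and one way to build relative paths instead. The helper name join_drive_path and the folder names are hypothetical and not part of Microsoft365R.

```r
# Hypothetical helper (not part of Microsoft365R): join a parent path and a
# child name without introducing a leading slash when the parent is empty.
join_drive_path <- function(parent_path, name) {
  # Drop any leading or trailing slashes from the parent path
  parent_path <- gsub("^/+|/+$", "", parent_path)
  if (nzchar(parent_path)) paste(parent_path, name, sep = "/") else name
}

# Base R behaviour that produces the leading slash
file.path("", "Reports")              # "/Reports"

# The helper keeps the path relative
join_drive_path("", "Reports")        # "Reports"
join_drive_path("Reports", "2024")    # "Reports/2024"
```

If the leading slash does turn out to matter, swapping file.path(parent_path, folder) for a helper like this would be the smallest change to try.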
### 5. Iterate Over Each Drive and Collect All Files
# Initialize a list to store all files
all_files <- list()
# Iterate over each drive
for (drv in drives) {
  drive_id <- drv$properties$id
  drive_name <- drv$properties$name
  web_url <- drv$properties$webUrl
  # Use the drive_id to fetch files
  library_files <- list_files_recursive(drive_id)
  # Store the retrieved files in the all_files list
  all_files[[paste0(drive_name, " (", substr(drive_id, 1, 8), ")")]] <- library_files
}
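Because two libraries share a name, the list label built from the name plus the first eight characters of the ID is only unique if those eight characters differ between the two drives. That is worth verifying against the real IDs; it is an assumption here that they might share a prefix. A minimal sketch with made-up names and IDs (hypothetical data, not taken from the real site) that keys the results by the full drive ID instead:

```r
# Hypothetical drive metadata; real values would come from drv$properties
drive_info <- data.frame(
  id   = c("b!AAAA1111xyz", "b!AAAA1111abc", "b!AAAA1111def"),
  name = c("Reports", "Reports", "Archive"),
  stringsAsFactors = FALSE
)

# Labels from name + first 8 characters of the ID collide in this example
short_labels <- paste0(drive_info$name, " (", substr(drive_info$id, 1, 8), ")")
anyDuplicated(short_labels) > 0   # TRUE: two drives would share one list slot

# Keying by the full ID keeps every drive separate
results <- vector("list", nrow(drive_info))
names(results) <- drive_info$id
for (i in seq_len(nrow(drive_info))) {
  # Placeholder for the real per-drive file listing
  results[[drive_info$id[i]]] <- data.frame(drive = drive_info$name[i])
}
length(results)   # 3
```

The names can still be looked up from drive_info by ID when reporting results, so nothing is lost by dropping them from the list keys.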
And here is the error:
Error: Absolute path incompatible with path starting from item ID
So, I know this error is happening because I have two folders with the same name, even though they are actually different folders. But even though I am using the drive ID instead of the name, I still get this error.
I appreciate any assistance!