JeniFav


UPDATED: Microsoft365R and SharePoint - Can you search all document libraries found in the Site Contents page?

UPDATED: I made a bit of progress, but now I am stuck on an error due to a drive name being used twice with two different drive IDs. They are different folders. I don't have the authority to change any drive names, so I need to figure out what to fix in order to access this drive and its contents. See below for the updated code and the error:

ORIGINAL INTRO:

I can't really create reproducible code, but I believe nearly all SharePoint sites have a Site Contents page at a URL that looks something like this:

https://COMPANY.sharepoint.com/teams/TEAM-NAME/_layouts/15/viewlsts.aspx?view=14

I've been tasked with searching ALL content on this SharePoint site for specific keywords. I know how to go to https://COMPANY.sharepoint.com/teams/TEAM-NAME/ and call get_drive(), which returns the default document library, called Documents. It contains 9 subfolders (or subdrives). But there are actually 87 document libraries in total. Maybe I just don't fully understand the structure of SharePoint, but I need access to all 87 document libraries found in Site Contents.

Here's my code. Once I get to the point where I confirm there are 87 libraries, I can't seem to get into them to search their contents. This is where I need help. I am sure my issue is my fundamental lack of knowledge of SharePoint and Microsoft365R.

UPDATED CODE:

library(tidyverse)
library(Microsoft365R) 
library(stringr)
library(HelpersMG) 
library(AzureAuth)
library(httr)
library(pdftools)  
library(readtext)  
library(jsonlite)

###  1. Authenticate to the SharePoint site
site <-  get_sharepoint_site(site_url="https://COMPANY.sharepoint.com/teams/TEAM-NAME/")

###  2. Get Site Information

# Get SharePoint site ID (needed for API calls)
site_id <- site$properties$id

# List all document libraries within the site
drives <- site$list_drives()

# Get the default document library (Documents)
drv <- site$get_drive()

###  3. Use Microsoft Graph API to List All Document Libraries

# Define the Microsoft Graph API endpoint for document libraries
url <- paste0("https://graph.microsoft.com/v1.0/sites/", site_id, "/drives")

# Make API request
response <- GET(url, add_headers(Authorization = paste("Bearer", site$token$credentials$access_token)))

# Parse response (stored as response_text so the variable does not shadow httr::content)
response_text <- content(response, as = "text", encoding = "UTF-8")
data <- fromJSON(response_text, flatten = TRUE)

libraries_df <- data$value
# Convert to data frame if it's not already
if (!is.data.frame(libraries_df)) {
  libraries_df <- as.data.frame(libraries_df, stringsAsFactors = FALSE)
}

# Confirms 87 document libraries
print(libraries_df[, c("id", "name")])

# Display drive names and IDs
for (drv in drives) {
  cat("Drive Name:", drv$properties$name, "\n")
  cat("Drive ID:", drv$properties$id, "\n\n")
}

### 4. Define a Function to Recursively List All Items in a Drive

list_files_recursive <- function(drive_id, parent_path = "") {

  # Retrieve the drive object using its ID
  drive <- site$get_drive(drive_id = drive_id)
  
  # Use list_items() to retrieve items in the specified folder
  items <- drive$list_items(parent_path)
  
  # Separate files and folders
  files <- items[!items$isdir, ]
  folders <- items[items$isdir, ]
  
  # Initialize a result with files
  all_files <- files
  
  # Recursively process each folder
  for (folder in folders$name) {
    folder_path <- file.path(parent_path, folder)
    all_files <- rbind(all_files, list_files_recursive(drive_id, folder_path))
  }
  
  return(all_files)
}


### 5. Iterate Over Each Drive and Collect All Files

# Initialize a list to store all files
all_files <- list()

# Iterate over each drive
for (drv in drives) {
  drive_id <- drv$properties$id
  drive_name <- drv$properties$name
  web_url <- drv$properties$webUrl
  
  # Use the drive_id to fetch files
  library_files <- list_files_recursive(drive_id)
  
  # Store the retrieved files in the all_files list
  all_files[[paste0(drive_name, " (", substr(drive_id, 1, 8), ")")]] <- library_files
}

And here is the error:

Error: Absolute path incompatible with path starting from item ID

I know this error is happening because I have two folders with the same name, even though they are actually different folders. But even though I am using the drive ID instead of the name, I am still getting this error.
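The error message mentions an absolute path rather than a name clash, so one thing that may be worth ruling out, as an untested assumption, is the path built in step 4: `file.path("", folder)` returns a string with a leading slash. The base R sketch below shows that behaviour. It also shows a hypothetical helper, `join_relative()`, which is not part of Microsoft365R and joins folder names without a leading slash.

```r
# Base R only; no SharePoint connection is needed to run this.

# What the recursive function builds on its first level of recursion,
# where parent_path is "" and the folder is (for illustration) "Reports":
first_level <- file.path("", "Reports")
print(first_level)  # starts with "/", so it looks like an absolute path

# Hypothetical helper (NOT part of Microsoft365R): join path parts while
# dropping empty ones, so the result never starts with "/".
join_relative <- function(parent_path, child) {
  parts <- c(parent_path, child)
  parts <- parts[nzchar(parts)]   # discard empty components such as ""
  paste(parts, collapse = "/")
}

print(join_relative("", "Reports"))       # "Reports"
print(join_relative("Reports", "2024"))   # "Reports/2024"
```

If the leading slash turns out to be the cause, `folder_path <- join_relative(parent_path, folder)` could replace the `file.path()` call in step 4. Whether that resolves the error on this site is untested.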

I appreciate any assistance!
