Reputation: 1
I've been reading in large numbers (1,000) of Excel files from Google Drive using readxl on a Windows machine for 6 months or so. I recently got a new MacBook and tried running the same code there - expecting it to be significantly faster. The only thing I changed was setwd("G:/...") to setwd("Users/...") to reflect the different format.
It turns out the job was MUCH slower - so slow that I terminated it. When I looked at the Google Drive status, it appears that prior to processing the Excel file, each file is being synched to the local machine, despite the Google Drive settings on the Mac set to "Stream Files" (Control Center -> Google Drive Preferences -> Google Drive -> Stream Files). No wonder it is taking so long, as it seems each individual file is being downloaded to my local drive before processing! This was not the behavior on Windows, where Google Drive preferences are also set to "Stream Files."
Anybody else encounter this? Any ideas on any settings I should change or potential workarounds? I guess I could download all the files to the local drive first, but that seems suboptimal.
Thanks in advance for any tips or workarounds you can share.
Windows 11: R 4.4.2, readxl 1.4.3, data.table 1.16.4 MacOS 15.3: R 4.4.2, readxl 1.4.3, data.table 1.16.4
# Relevant code snippet to read Excel follows
# Loop through each file in the directory
for (file in files) {
# Retrieve sheet names from Excel file
dts <- excel_sheets(file)
# Check if the file contains a tab called "Paste Here"
if ("Paste Here" %in% dts) {
# Read the file into a data.table
dt <- as.data.table(read_excel(file, sheet = "Paste Here"))
...
Upvotes: 0
Views: 44