Reputation: 4243
I receive a daily email that contains a download link. Clicking the link downloads a CSV file whose file name is different every time. On top of this, the text of the download link itself changes daily.
Is there a way to extract the hyperlink from the body of the email and read the downloaded file into R?
My email looks like this:
My usual code to read anything from an email looks like this:
##Load Libraries
library(readr)
library(RDCOMClient)
library(plotrix)
outlook_app <- COMCreate("Outlook.Application")
search <- outlook_app$AdvancedSearch(
  "Inbox",
  "urn:schemas:httpmail:subject = 'SUBJECT NAME'"
)
Sys.sleep(5) # Wait a hot sec!
results <- search$Results() # Saves search results into results object
Sys.sleep(5) # Wait a hot sec!
results$Item(1)$ReceivedTime() # Received time of first search result
as.Date("1899-12-30") + floor(results$Item(1)$ReceivedTime()) # Received date
# Iterate through the search results and keep the email received today.
# seq_len() avoids the 1:0 trap when the search returns no results.
for (i in seq_len(results$Count())) {
  received <- as.Date("1899-12-30") + floor(results$Item(i)$ReceivedTime())
  if (received == Sys.Date()) {
    email <- results$Item(i)
  }
}
attachment_file <- tempfile()
email$Attachments(1)$SaveAsFile(attachment_file)
## Automatically determine the csv file name
file_name <- unzip(attachment_file, list = TRUE)
csv_file <- file_name$Name
##Read CSV File
df <- read_csv(unz(attachment_file,csv_file), skip = 25)
Upvotes: 3
Views: 520
Reputation: 2233
You can consider something like this:
library(RDCOMClient)
library(stringr)
## Create the Outlook object
OutApp <- COMCreate("Outlook.Application")
outlookNameSpace <- OutApp$GetNameSpace("MAPI")
# 6 = olFolderInbox (the default Inbox)
fld <- outlookNameSpace$GetDefaultFolder(6)
# Count the items, which is also a quick check that we got the right folder
emails <- fld$Items()
Cnt <- emails$Count()
list_Text_Body <- list()
list_URL_Link <- list()
# For each email, store the plain-text body and every URL found in it
for (i in seq_len(Cnt)) {
  print(i)
  list_Text_Body[[i]] <- emails$Item(i)[["Body"]]
  list_URL_Link[[i]] <- stringr::str_extract_all(list_Text_Body[[i]], pattern = "http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+")[[1]]
}
read.csv(list_URL_Link[[1]][[1]])
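If reading straight from the URL gives you trouble, it may help to split the job into two small steps: pull the link out of the body text, then download it to a temporary file before reading it. Here is a minimal base-R sketch of that idea. The helper names `extract_urls` and `read_csv_from_link`, the sample body and the example.com address are all made up for illustration, and I am assuming the plain-text body shows the link as `text <url>`, which is what Outlook typically does. If the URL is missing from `Body`, the `HTMLBody` property of the mail item may be worth checking instead.

```r
# Hypothetical helper: pull every http(s) URL out of a plain-text email body.
extract_urls <- function(body) {
  # "http://" or "https://" followed by anything that is not whitespace,
  # an angle bracket, or a double quote.
  pattern <- "https?://[^[:space:]<>\"]+"
  regmatches(body, gregexpr(pattern, body))[[1]]
}

# Hypothetical helper: download a URL to a temporary file and read it as CSV.
# Extra arguments (e.g. skip) are passed on to read.csv().
read_csv_from_link <- function(url, ...) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp), add = TRUE)   # clean up the temporary file afterwards
  download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  read.csv(tmp, ...)
}

# Made-up sample body, standing in for emails$Item(i)[["Body"]].
sample_body <- "Your report is ready.\r\nDownload here <https://example.com/files/report_2024.csv>\r\nThanks"
urls <- extract_urls(sample_body)
print(urls)
```

With a real email you would then call something like `df <- read_csv_from_link(urls[1])`. Because the file name never has to be known in advance, it does not matter that it changes every day. Note that this only works if the link can be downloaded without logging in through a browser; that is an assumption about your link that I cannot check.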
Upvotes: 1