Nick Knauer

Reputation: 4243

Download a link from an email and read that file into R

I receive a daily email that contains a download link. Clicking the link downloads a CSV file whose file name is different every time. On top of this, the name of the download link itself changes daily.

Is there a way to extract the hyperlink from the body of the email and read the downloaded file into R?

My email looks like this:

[image: screenshot of the email]

My usual code to read anything from an email looks like this:

##Load Libraries
library(readr)
library(RDCOMClient)
library(plotrix)
outlook_app <- COMCreate("Outlook.Application")
search <- outlook_app$AdvancedSearch(
  "Inbox",
  "urn:schemas:httpmail:subject = 'SUBJECT NAME'"
)

Sys.sleep(5) # Wait a hot sec!


results <- search$Results() # Saves search results into results object

Sys.sleep(5) # Wait a hot sec!

results$Item(1)$ReceivedTime() # Received time of first search result

as.Date("1899-12-30") + floor(results$Item(1)$ReceivedTime()) # Received date

# Loop over the search results and keep the email received today
for (i in seq_len(results$Count())) {
  received <- as.Date("1899-12-30") + floor(results$Item(i)$ReceivedTime())
  if (received == Sys.Date()) {
    email <- results$Item(i)
  }
}

attachment_file <- tempfile()
email$Attachments(1)$SaveAsFile(attachment_file)

## Automatically determine the CSV file name inside the zip
file_name <- unzip(attachment_file, list = TRUE)
csv_file <- file_name$Name

## Read the CSV file
df <- read_csv(unz(attachment_file, csv_file), skip = 25)

Upvotes: 3

Views: 520

Answers (1)

Emmanuel Hamel

Reputation: 2233

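You can consider something like this. It loops over the Inbox, extracts the URLs from each email body with a regular expression, and reads the CSV from the first link: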

library(RDCOMClient)
library(stringr)

## create outlook object
OutApp <- COMCreate("Outlook.Application")
outlookNameSpace <- OutApp$GetNameSpace("MAPI")
fld <- outlookNameSpace$GetDefaultFolder(6)

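# Count the items in the Inbox (default folder 6) and get the item accessor
Cnt <- fld$Items()$Count()
emails <- fld$items
list_Text_Body <- list()
list_URL_Link <- list()

# Pattern matching http:// or https:// URLs in the plain-text body
url_pattern <- "http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\\(\\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+"

# Store each email body and the URLs found in it
for (i in seq_len(Cnt))
{
  print(i) # progress indicator
  list_Text_Body[[i]] <- emails(i)[["Body"]]
  list_URL_Link[[i]] <- stringr::str_extract_all(list_Text_Body[[i]], pattern = url_pattern)[[1]]
}

# Read the CSV from the first URL found in the first email
read.csv(list_URL_Link[[1]][[1]])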
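If you would rather avoid the stringr dependency and keep a local copy of the file, the extraction and download steps can be sketched in base R. This is only a sketch: `extract_urls()` and `read_csv_from_link()` are hypothetical helper names, the sample body and its example.com link are made up, and it assumes Outlook renders the hyperlink in the plain-text body as `Link text <https://...>`, which may not hold for every email.

```r
# Hypothetical helper: return every http/https URL found in a text string.
# Stops at whitespace, angle brackets and double quotes, so a link rendered
# as "Download <https://...>" comes back without the brackets.
extract_urls <- function(text) {
  matches <- gregexpr("https?://[^\\s<>\"]+", text, perl = TRUE)
  regmatches(text, matches)[[1]]
}

# Hypothetical helper: download a link to a temporary file and read it as CSV.
# Extra arguments (for example skip) are passed on to read.csv().
read_csv_from_link <- function(url, ...) {
  dest <- tempfile(fileext = ".csv")
  download.file(url, destfile = dest, mode = "wb", quiet = TRUE)
  read.csv(dest, ...)
}

# Made-up email body; in practice this would be email[["Body"]]
body <- "Your report is ready.\nDownload <https://example.com/reports/abc123?token=xyz>\nThanks"
urls <- extract_urls(body)
print(urls)
```

With a real email you would then call something like `df <- read_csv_from_link(urls[1])`. Downloading to a temporary file first means the changing file name on the server does not matter, since R only ever sees the temporary path.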

Upvotes: 1
