Reputation: 81
I've been struggling to figure this out on my own, so reaching out for some assistance. I am trying to build urls based on multiple variables (months and years) of different lengths so that I have a url for each combination of month and year from the lists I created.
I've done something similar in Python but need to translate it into R, and I'm running into issues with building the function and the for loops. Here's the Python code:
import calendar

# set years and months
oasis_market_yr = ('2020','2019','2018','2017','2016','2015','2014','2013','2012','2011')
oasis_market_mn = ('01','02','03','04','05','06','07','08','09','10','11','12')
# format url string
URL_FORMAT_STRING = 'http://oasis.caiso.com/oasisapi/SingleZip?queryname=CRR_INVENTORY&market_name=AUC_MN_{year}_M{month}_TC&resultformat=6&market_term=ALL&time_of_use=ALL&startdatetime={year}{month}01T07:00-0000&enddatetime={year}{month}{last_day_of_month}T07:00-0000&version=1'
# create function to make urls
def make_url(year, month):
    # monthrange returns (weekday of first day, number of days in month)
    last_day_of_month = calendar.monthrange(int(year), int(month))[1]
    return URL_FORMAT_STRING.format(year=year, month=month, last_day_of_month=last_day_of_month)
# build urls for download
for y in oasis_market_yr:
    for m in oasis_market_mn:
        url = make_url(y, m)
I've tried using sapply and mapply with str_glue and a few other methods but can't seem to replicate the outcome. I keep getting an error that reads: Error: Variables must be length 1 or 5. Or, for instance with mapply, it maps the first value in one list to the first in the other list and so on, then returns when the short list runs out of values. What I need is all the combinations from both lists.
Any assistance would be much appreciated.
Upvotes: 1
Views: 94
Reputation: 4243
An option using glue and lubridate. Note I added _i to the {month} and {year} variables to avoid confusion with the month and year functions in lubridate.
library(glue)
library(lubridate)
URL_FORMAT_STRING <- 'http://oasis.caiso.com/oasisapi/SingleZip?queryname=CRR_INVENTORY&market_name=AUC_MN_{year_i}_M{month_i}_TC&resultformat=6&market_term=ALL&time_of_use=ALL&startdatetime={year_i}{month_i}01T07:00-0000&enddatetime={year_i}{month_i}{last_day_of_month}T07:00-0000&version=1'
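make_url <- function(year_i, month_i){
  # round up to the first day of the next month, then step back one day
  last_day_of_month <- day(ceiling_date(my(paste(month_i, year_i)), 'month') - days(1))
  # glue fills {year_i}, {month_i} and {last_day_of_month} from this environment
  glue(URL_FORMAT_STRING)
}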
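For comparison with the Python in the question, the last-day calculation in make_url (go to the first day of the following month, then step back one day) can be sketched with the standard library alone. This is only an illustration; the function name last_day_of_month is an assumption and is not part of either post.

```python
from datetime import date, timedelta


def last_day_of_month(year: int, month: int) -> int:
    """Return the day number of the last day of the given month."""
    # First day of the following month, rolling December over into January.
    if month == 12:
        first_of_next = date(year + 1, 1, 1)
    else:
        first_of_next = date(year, month + 1, 1)
    # One day earlier is the last day of the requested month.
    return (first_of_next - timedelta(days=1)).day


print(last_day_of_month(2020, 2))
print(last_day_of_month(2019, 2))
```

This should agree with calendar.monthrange(year, month)[1], which the question's code uses.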
And then, rather than a nested for loop, you can use expand.grid to build all combinations of oasis_market_yr and oasis_market_mn, and mapply to apply your function to each combination.
df_vars <- expand.grid(year_i = oasis_market_yr, month_i = oasis_market_mn)
mapply(make_url, df_vars$year_i, df_vars$month_i)
# [1] "http://oasis.caiso.com/oasisapi/SingleZip?queryname=CRR_INVENTORY&market_name=AUC_MN_2020_M01_TC&resultformat=6&market_term=ALL&time_of_use=ALL&startdatetime=20200101T07:00-0000&enddatetime=20200131T07:00-0000&version=1"
# [2] "http://oasis.caiso.com/oasisapi/SingleZip?queryname=CRR_INVENTORY&market_name=AUC_MN_2019_M01_TC&resultformat=6&market_term=ALL&time_of_use=ALL&startdatetime=20190101T07:00-0000&enddatetime=20190131T07:00-0000&version=1"
#....
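In Python terms, the difference between the pairwise behaviour described in the question and the expand.grid approach is roughly the difference between zip and itertools.product. The sketch below uses shortened, purely illustrative lists (the names years and months are assumptions) and puts months in the outer position so that the year varies fastest, as in the output above.

```python
from itertools import product

# Shortened, illustrative inputs of different lengths.
years = ('2020', '2019', '2018')
months = ('01', '02')

# Pairwise: first with first, second with second, stopping at the shorter input.
paired = list(zip(years, months))

# Cross product: every year with every month. product varies its last
# argument fastest, so months goes first to make the year change fastest.
combos = [(y, m) for m, y in product(months, years)]

print(len(paired))
print(len(combos))
print(combos[:2])
```

With the full lists from the question (10 years, 12 months) the cross product would contain 120 combinations.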
Upvotes: 0
Reputation: 2894
Your syntax was a little too Pythonic and won't work like that in R.
In R, the same syntax would look like this:
# set years and months
oasis_market_yr = c('2020','2019','2018','2017','2016','2015','2014','2013','2012','2011')
oasis_market_mn = c('01','02','03','04','05','06','07','08','09','10','11','12')
# create function to make urls
make_url = function(year, month){
  # format url string
  URL_FORMAT_STRING = 'http://oasis.caiso.com/oasisapi/SingleZip?queryname=CRR_INVENTORY&market_name=AUC_MN_{year}_M{month}_TC&resultformat=6&market_term=ALL&time_of_use=ALL&startdatetime={year}{month}01T07:00-0000&enddatetime={year}{month}{last_day_of_month}T07:00-0000&version=1'
  # days per month in a common year
  lastdays = c(31,28,31,30,31,30,31,31,30,31,30,31)
  # leap year: divisible by 4 but not by 100, or divisible by 400
  y = as.integer(year)
  if((y %% 4 == 0 && y %% 100 != 0) || y %% 400 == 0){lastdays[2] = 29}
  last_day_of_month = as.character(lastdays[as.integer(month)])
  # fixed=TRUE matches the placeholders literally instead of as regular expressions
  fs = gsub("{month}", month, URL_FORMAT_STRING, fixed=TRUE)
  fs = gsub("{year}", year, fs, fixed=TRUE)
  fs = gsub("{last_day_of_month}", last_day_of_month, fs, fixed=TRUE)
  return(fs)
}
# build urls for download
for(y in oasis_market_yr){
  for(m in oasis_market_mn){
    url = make_url(y, m)
    print(url)
  }
}
As I am not aware of a direct counterpart to Python's string formatting method in R, I changed it to replacements: a = gsub(pattern, replacement, a, fixed=TRUE) corresponds to the Python command a = a.replace(pattern, replacement). It should work beautifully.
Also, you don't really need a calendar package to get the last day of each month. Just supply the days per month as a vector and adjust February for leap years, and Bob's your uncle.
I don't know whether the URLs that are generated are really the ones you need. But you might be able to work from this translation to correct it, if something is wrong.
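To make the correspondence with Python concrete, here is a minimal sketch of the same replacement-based approach, including a hand-written days-per-month table. URL_TEMPLATE is a shortened stand-in for the real format string, and the names is_leap and fill_template are assumptions made for this illustration only.

```python
# Shortened, hypothetical stand-in for the full URL format string.
URL_TEMPLATE = 'market_name=AUC_MN_{year}_M{month}_TC&enddatetime={year}{month}{last_day_of_month}'


def is_leap(year: int) -> bool:
    # Gregorian rule: divisible by 4 but not by 100, or divisible by 400.
    return (year % 4 == 0 and year % 100 != 0) or year % 400 == 0


def fill_template(template: str, year: str, month: str) -> str:
    # Days per month in a common year; February is adjusted for leap years.
    last_days = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    if is_leap(int(year)):
        last_days[1] = 29
    last_day = str(last_days[int(month) - 1])
    # Literal placeholder replacement, one placeholder at a time.
    out = template.replace('{month}', month)
    out = out.replace('{year}', year)
    out = out.replace('{last_day_of_month}', last_day)
    return out


print(fill_template(URL_TEMPLATE, '2020', '02'))
```

For a template that contains only these three placeholders, the result should match what str.format produces with the same values.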
Upvotes: 0