C_Mu
C_Mu

Reputation: 325

How R write a Macro like SAS

I have a really long sqlquery like this, and need to run a few times by only changing the date. I am good at SAS but quite new to R, so I am struggling to write something similar to SAS.

df <- sqlQuery(datamart, paste0("Select xxxxxxxxxxxxxxxxxx from xxxxx
where date = '28Feb2018'"), as.is=TRUE, stringsAsFactors = FALSE)

Can you share some experience please?

Thank you!


edited on 6/5/2018:

I edited the code according to the answers below and I still have some trouble in running the code correctly. currently my code become:

safeqry <- function(date_string)
{require(RODBCext)
qry_string <- paste0("DELETE FROM [T_SPP] WHERE [BkDt]<=#?","#")
  parms <- data.frame(date_string, stringsAsFactors=FALSE)
  sqlExecute(access, qry_string, parms, fetch=TRUE)}
safeqry('2016-04-30')

error is :

42000 -3100 [Microsoft][ODBC Microsoft Access Driver] Syntax error in date in query expression '[BkDt]<=#Pa_RaM000'. [RODBCext] Error: SQLExecute failed In addition: Warning message: In sqlExecute(access, qry_string, parms, fetch = TRUE) :

the other code is

query_delete = function (table, date_col, date) {
paste0('DELETE FROM [',table,'] WHERE [',date_col,']<=#',date, '#')}
sqlQuery(access, query_delete("T_SPP", "BkDt", "2018-04-30"),as.is = TRUE, stringsAsFactors = FALSE)

error is

[1] "[RODBC] ERROR: Could not SQLExecDirect 'DELETE FROM [T_SPP] WHERE [BkDt]<=#2018-04-30#'"

Upvotes: 2

Views: 1354

Answers (3)

r2evans
r2evans

Reputation: 160952

The simplest method looks something like this:

macro1 <- function(dt) {
  qry <- paste0("select xxxxxxxxxx from xxxx where date='", dt, "'")
  sqlQuery(datamart, qry, as.is=TRUE, stringsAsFactors=FALSE)
}

But there are a couple of things wrong with it:

  • it assumes the connection object datamart is available in the parent (and/or global) environment and is valid; if you are testing with different connections, I guarantee this will bite you in ways you do not expect; and
  • it is very prone to SQL injection (comic: xkcd 327)
  • if your argument is empty or length 2 or more, it may not do what you want

A slightly more robust function is something like:

macro2 <- function(dt, con) {
  if (length(dt) == 0L) {
    stop("'dt' is not length 1")
  } else if (length(dt) > 1L) {
    warning("'dt' has length > 1 and only the first element will be used")
    dt <- dt[[1L]]
  }
  qry <- sprintf("select xxxxxxxxxx from xxxx where date='%s'", sQuote(dt))
  sqlQuery(con, qry, as.is=TRUE, stringsAsFactors=FALSE)
}

though a better solution would use variable binding ("parameterized queries"), which is unique to each database-type. As Hong Ooi suggested, RODBCext provides this for RODBC connections, but you'll need something more database-specific otherwise.

If you want to be a little lazy and feel safe that the connection will always be in the global, you might be tempted to do something like:

macro2 <- function(dt, con=datamart) ...

which will work, but I still guard against it. Experience shows that explicit is often much safer and easier to troubleshoot.

From here, a loop can be used, whether a for loop as JasonAizkalns suggested, or, perhaps something like:

answers <- lapply(vector_of_dates, macro2)

Upvotes: 3

Hong Ooi
Hong Ooi

Reputation: 57696

The R equivalent to a SAS macro is a function. So you write a function that takes the date as a parameter, and then pass the date to the query.

The simplest way to do it is via string manipulation:

qry <- function(date_string)
{
    qry_string <- paste0("select xxxxx from yyy where date = '", date_string, "'")
    sqlQuery(datamart, qry_string, as.is=TRUE, stringsAsFactors=FALSE)
}

HOWEVER, this is generally unsafe because people can pass malicious strings to your function, causing it to do Bad Things. Instead, consider using the RODBCext package to run parameterized queries rather than messing with strings:

safeqry <- function(date_string)
{
    require(RODBCext)
    qry_string <- paste0("select xxxxx from yyy where date = ?")
    parms <- data.frame(date_string, stringsAsFactors=FALSE)
    sqlExecute(datamart, qry_string, parms, fetch=TRUE)
}

Upvotes: 5

JasonAizkalns
JasonAizkalns

Reputation: 20483

There are several ways to do this, but since you said you are new to R, this might be a natural approach. First, create function which allows you to change the select_cols, tbl, and date and returns a string:

make_query <- function(select_cols, tbl, date) {
  paste0("SELECT ", select_cols, " FROM ", tbl, " WHERE date = '", date, "';")
}

make_query("*", "my_table", "28Feb2018")
[1] "SELECT * FROM my_table WHERE date = '28Feb2018';"
make_query("*", "different_table", "28Feb2018")
[1] "SELECT * FROM different_table WHERE date = '28Feb2018';"

Then you can create a vector of dates to loop through:

various_dates <- c("28Feb2018", "01Mar2018", "02Mar2018")

for (date in seq_along(various_dates)) {
  make_query("*", "my_table", various_dates[date])
}

Of course, you can modify the body of the loop to use your sqlQuery function:

for (date in seq_along(various_dates)) {
  sqlQuery(datamart, make_query("*", "my_table", various_dates[date]), 
           as.is = TRUE, stringsAsFactors = FALSE)
}

And since it looks like you want to save the results, you can pre-allocate an empty list the same length as the number of dates and save those results:

df <- vector("list", length(various_dates))
for (date in seq_along(various_dates)) {
  df[[date]] <- sqlQuery(datamart, make_query("*", "my_table", various_dates[date]), 
           as.is = TRUE, stringsAsFactors = FALSE)
}

Upvotes: 1

Related Questions