Andrew

Reputation: 9190

String continuation across multiple lines, no newline characters

I am using the RODBC library to bring data into R. I have a long query that I want to pass a variable to, much like this SO user.

The problem is that R keeps the line breaks in my multi-line query string as newline characters ('\n').

The accepted solution for that question suggests breaking the text into chunks and then joining them with paste(). That works, but ideally I'd like to keep the whitespace intact, since that makes it easier to test and verify the behavior of the query in the database before pasting it into R.

In other languages I'm familiar with, there's a simple line continuation character. Indeed, several of the comments on the accepted answer are looking for an approach similar to Python's \.

I found an aside to a workaround using strwrap deep in the bowels of an R discussion list, so in the interest of making the internet better I will post it here. However, if someone can point me toward a more elegant or straightforward solution, I will happily accept your answer.

Upvotes: 15

Views: 19788

Answers (4)

C8H10N4O2

Reputation: 19005

# collapse each run of whitespace (including newlines) into a single space
query <- gsub(pattern='\\s+',replacement=" ",x=query)

Upvotes: 11

Andrew

Reputation: 9190

Per its documentation, R's strwrap destroys whitespace in its input, including newline characters.

Essentially, you can get the desired behavior by letting R introduce the newline characters (\n) when the string is defined, and then immediately stripping them out.

#make query using PASTE
query_1 <- paste("SELECT map.ps_studentid 
      ,students.first_name || ' ' || students.last_name AS full_name
      ,map.testritscore
      ,map.termname
      ,map.measurementscale
FROM map$comprehensive_with_growth map
JOIN students 
  ON map.ps_studentid = students.id
WHERE map.termname = '",map_term,"'", sep='')

#remove newline characters introduced above.  
#width is an arbitrary big number-
#it just needs to be longer than your string.
query_1 <- strwrap(query_1, width=10000, simplify=TRUE)

#execute the query 
map_njask <- sqlQuery(XE, query_1)

Upvotes: 14

jonsedar

Reputation: 314

Try using sprintf to get variable substitution, and then replacing all newlines and whitespace.
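A minimal sketch of that idea, assuming a shortened version of the query from the question and a made-up value for map_term. Here each run of whitespace is collapsed to a single space rather than removed outright, so the SQL keywords stay separated:

```r
# Hypothetical parameter value; in the question this is map_term.
map_term <- "Spring 2012"

# Multi-line template, formatted as it would be in a database client.
# %s marks where the parameter goes.
template <- "SELECT map.ps_studentid
      ,map.testritscore
FROM map$comprehensive_with_growth map
WHERE map.termname = '%s'"

# Substitute the parameter, then collapse each run of whitespace
# (including newlines) into a single space.
query <- sprintf(template, map_term)
query <- gsub("\\s+", " ", query)
```

Two caveats: a literal % in the SQL (for example in a LIKE pattern) has to be written as %% in the template, and the gsub step will also collapse runs of whitespace inside quoted SQL string literals.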

See my answer to a similar question for details.

Upvotes: 0

joran

Reputation: 173587

I don't know if you will find this helpful or not, but I have eventually gravitated towards keeping my SQL separate from my R scripts. I find that keeping queries in my R script, except for very short ones, gets unreadable very quickly.

These days, I tend to keep queries that are more than a single line in their own separate .sql file. Then I can keep them nice and formatted and readable in a nice text editor, and read them into R as needed via something like this:

read_sql <- function(path){
    stopifnot(file.exists(path))
    sql <- readChar(path,nchar = file.info(path)$size)
    sql
}

For binding parameters into the queries, I just keep a %s where the parameter will go in the .sql file, and then add in the parameters in R using sprintf.
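As a rough illustration of how the pieces fit together, the sketch below repeats read_sql, writes a stand-in .sql file to a temporary path, and fills in the %s with sprintf. The file contents and the value of map_term are made up for the example:

```r
# read_sql as defined earlier in this answer.
read_sql <- function(path){
    stopifnot(file.exists(path))
    readChar(path, nchar = file.info(path)$size)
}

# Stand-in for a real .sql file (hypothetical contents).
sql_path <- tempfile(fileext = ".sql")
writeLines(c(
    "SELECT map.ps_studentid, map.testritscore",
    "FROM map$comprehensive_with_growth map",
    "WHERE map.termname = '%s'"
), sql_path)

# Hypothetical parameter value, substituted where the %s sits.
map_term <- "Spring 2012"
query <- sprintf(read_sql(sql_path), map_term)
```

The resulting string still contains the file's line breaks, so whether they need stripping before the query is sent depends on the database driver. Note also that sprintf does no escaping, so this is only suitable for parameter values you control.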

I've been much happier this way, as I was finding that cluttering up my R scripts with really long paste statements and multi-line character objects was making my code really hard to read.

Upvotes: 18
