Have R source() preserve empty lines

Question

I would like 'source()' to reproduce the input file. This almost gets there:

source(in.file, echo=TRUE, max.deparse=100000, prompt.echo="", print.eval=FALSE)

Unfortunately it discards empty lines! I don't see this behaviour mentioned in the documentation. Is it a bug?

To preserve the readability of the source, and avoid spurious version control changes, I'd like to reproduce it exactly. How can I achieve it?

I could use a hack: first replace all empty lines with "#empty", then run 'source()' in R, then reverse the substitution in the output. But that's circuitous and ugly.

Example input. In real use I'd define 'foo' in my R session, before calling 'source', but to keep this self-contained, I'm including it here:

foo <- function(x) {
   if (x > 100) cat ("# Wow!\n")
}
# comment kept.
# empty lines?

foo(2)
foo(4)
foo(16)
foo(256)


foo(2015)
foo(9)
foo(22)
# the end

Desired output -- I'd like to annotate any large 'x' with "Wow":

foo <- function(x) {
   if (x > 100) cat ("# Wow!\n")
}
# comment kept.
# empty lines?

foo(2)
foo(4)
foo(16)
foo(256)
# Wow!


foo(2015)
# Wow!
foo(9)
foo(22)
# the end

Actual output:

foo <- function(x) {
   if (x > 100) cat ("# Wow!\n")
}

# comment kept.
# empty lines?

foo(2)

foo(4)

foo(16)

foo(256)
# Wow!

foo(2015)
# Wow!

foo(9)

foo(22)

# the end

I get an empty line after every statement or comment block. Empty lines in the input are lost.

Thomas · Accepted Answer

I see what you're saying here, but I think you misunderstand what source() is actually doing. First, it reads in a file using readLines(), then it parses it into a series of R expressions, then it prints out the parsed, evaluated expressions (possibly with the original parsed expressions at a prompt, and possibly other information if verbose = TRUE). To understand this, let's take it in steps and see what's happening:

Step 1. Read in file:

> readLines(in.file)
 [1] "foo <- function(x) {"                "   if (x > 100) cat (\"# Wow!\\n\")"
 [3] "}"                                   "# comment kept."                    
 [5] "# empty lines?"                      ""                                   
 [7] "foo(2)"                              "foo(4)"                             
 [9] "foo(16)"                             "foo(256)"                           
[11] ""                                    ""                                   
[13] "foo(2015)"                           "foo(9)"                             
[15] "foo(22)"                             "# the end"                          
[17] ""                                    ""

Step 2. Parse expressions:

> parse(in.file)
expression(foo <- function(x) {
   if (x > 100) cat ("# Wow!\n")
}, foo(2), foo(4), foo(16), foo(256), foo(2015), foo(9), foo(22))
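
The printed expression vector shows no comments or blank lines, but the parser can record where each expression sat in the file. A minimal sketch, assuming parse() is called with keep.source = TRUE (the demo file and the variable names below are made up for illustration):

```r
# Hypothetical stand-in for in.file, written to a temporary location.
demo.file <- tempfile(fileext = ".R")
writeLines(c("x <- 1", "", "# a comment", "", "y <- x + 1"), demo.file)

# keep.source = TRUE asks the parser to attach source references (srcrefs).
exprs <- parse(demo.file, keep.source = TRUE)

# Only the two assignments become expressions; the comment and the
# blank lines are not expressions at all.
n.exprs <- length(exprs)

# Each srcref is an integer vector whose 1st and 3rd elements are the
# first and last file line of the expression.
refs <- attr(exprs, "srcref")
first.line <- vapply(refs, function(r) as.integer(r)[1L], integer(1))
last.line  <- vapply(refs, function(r) as.integer(r)[3L], integer(1))

print(data.frame(first.line, last.line))
```

The gaps between one expression's last line and the next expression's first line are where the comments and empty lines were. Those line numbers are what let source() echo comments when source references are kept.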

Step 3. Evaluate and print:

Now, this is where the problem is for you. Because source() has created an R parse tree from the read-in file, the original structure of the file is basically completely lost. (It's a little more complicated than this actually, but you can look at the source code for yourself to see that.) To achieve your desired output, you need to change the way that R parses, evaluates, and prints each expression.

That is a lot of work, but you can make one simple change related to Step 3 that you might be happy with. Near the bottom of source(), you'll find the following:

        if (nd) {
            do.trunc <- nd > max.deparse.length
            dep <- substr(dep, 1L, if (do.trunc) 
              max.deparse.length
            else nd)
            cat("\n", dep, if (do.trunc) 
              paste(if (length(grep(sd, dep)) && length(grep(oddsd, 
                dep))) 
                " ...\" ..."
              else " ....", "[TRUNCATED] "), "\n", sep = "")
        }

If you remove the "\n" from cat("\n", ... as follows:

        if (nd) {
            do.trunc <- nd > max.deparse.length
            dep <- substr(dep, 1L, if (do.trunc) 
              max.deparse.length
            else nd)
            cat(dep, if (do.trunc) 
              paste(if (length(grep(sd, dep)) && length(grep(oddsd, 
                dep))) 
                " ...\" ..."
              else " ....", "[TRUNCATED] "), "\n", sep = "")
        }

You'll get something closer to your intended result:

> source(in.file, echo=TRUE, max.deparse=100000, prompt.echo="", print.eval=FALSE)
foo <- function(x) {
+    if (x > 100) cat ("# Wow!\n")
+ }
# comment kept.
# empty lines?

foo(2)
foo(4)
foo(16)
foo(256)
# Wow!
foo(2015)
# Wow!
foo(9)
foo(22)
# the end
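
Depending on your R version, you may not need to edit source() yourself. Newer releases document a spaced argument that controls whether a newline is printed before each echoed expression. The sketch below assumes that argument is available in your installation (check ?source); the demo file is made up for illustration.

```r
# Hypothetical demo file; any script would do.
demo.file <- tempfile(fileext = ".R")
writeLines(c("x <- 1", "y <- 2", "# a comment", "z <- x + y"), demo.file)

# spaced = FALSE (assumed available, see ?source) suppresses the newline
# that is otherwise printed before each echoed chunk.
# keep.source = TRUE makes the echo use the original text, comments included.
# continue.echo = "" drops the "+ " prefix on continuation lines.
out <- capture.output(
  source(demo.file, echo = TRUE, spaced = FALSE, keep.source = TRUE,
         prompt.echo = "", continue.echo = "", print.eval = FALSE,
         max.deparse.length = 100000)
)
writeLines(out)
```

This only removes the extra blank line before each statement. Empty lines that directly follow a statement in the input are still dropped.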

But if you actually want to preserve whitespace exactly as in the original input file, you're going to have to change Step 2 (i.e., change the way R parses and evaluates the input file so that, basically, it evaluates an empty line to be cat("\n")). That's probably a lot of work to achieve.
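
A lighter-weight alternative to changing the parser is to skip source()'s echo and write a small echo loop of your own. The sketch below is not how source() works internally, and echo_source() is a hypothetical helper, not part of base R. It assumes the line numbers stored in the srcrefs from parse(keep.source = TRUE) match the lines returned by readLines(), which should hold for ordinary scripts without #line directives.

```r
# echo_source(): hypothetical helper that prints the file verbatim and
# evaluates each top-level expression right after its last line is shown.
echo_source <- function(file, envir = globalenv()) {
  lines <- readLines(file, warn = FALSE)
  exprs <- parse(file, keep.source = TRUE)
  refs  <- attr(exprs, "srcref")
  # Last source line of each top-level expression.
  last.line <- vapply(refs, function(r) as.integer(r)[3L], integer(1))

  shown <- 0L
  for (i in seq_along(exprs)) {
    # Echo everything up to the end of this expression exactly as written,
    # including any blank lines and comments that precede it.
    if (last.line[i] > shown) {
      writeLines(lines[(shown + 1L):last.line[i]])
      shown <- last.line[i]
    }
    # Evaluate without auto-printing, similar in spirit to print.eval = FALSE.
    eval(exprs[[i]], envir)
  }
  # Echo trailing comments and blank lines after the last expression.
  if (shown < length(lines)) {
    writeLines(lines[(shown + 1L):length(lines)])
  }
  invisible(NULL)
}

# Usage on a made-up file that mimics the question's input.
demo.file <- tempfile(fileext = ".R")
writeLines(c("foo <- function(x) if (x > 100) cat(\"# Wow!\\n\")",
             "",
             "foo(2)", "foo(256)", "", "", "foo(9)", "# the end"),
           demo.file)
echo_source(demo.file, envir = new.env())
```

Because the echoed text comes straight from readLines() rather than from the parsed expressions, empty lines survive. Anything the evaluated code prints with cat() appears right after the statement that produced it. The sketch does no error handling and never prints the values of expressions.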
