Primo Petri

Reputation: 434

Is there in R something like the "here document" in bash?

My script contains the line

lines <- readLines("~/data")

I would like to keep the content of the file data (verbatim) in the script itself. Is there a "read_the_following_lines" function in R, something like the "here document" in the bash shell?

Upvotes: 5

Views: 1710

Answers (5)

underscore

Reputation: 67

What about some more recent tidyverse syntax?

library(magrittr)  # provides the %>% pipe

SQL <- "
SELECT * FROM patient
LEFT OUTER JOIN projectpatient ON patient.patient_id = projectpatient.patient_id
WHERE projectpatient.project_id = 16;
" %>% stringr::str_replace_all("[\r\n]", " ")  # collapse line breaks into spaces

Upvotes: 0

Lucio Queiroz

Reputation: 25

Since R 4.0.0 there is a raw-string syntax, announced in the release notes, that largely allows heredoc-style documents to be created.

Additionally, from help(Quotes):

The delimiter pairs [] and {} can also be used, and R can be used in place of r. For additional flexibility, a number of dashes can be placed between the opening quote and the opening delimiter, as long as the same number of dashes appear between the closing delimiter and the closing quote.
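The delimiter variants described in that help text can be sketched as follows. This is only an illustration, assuming R 4.0.0 or later; the variable names are made up for the example.

```r
# Requires R >= 4.0.0 (raw string syntax).
paren   <- r"(C:\temp\new)"              # default ( ) delimiters; backslashes stay literal
bracket <- R"[say "hi" (quietly)]"       # [ ] delimiters with the upper-case R prefix
brace   <- r"{a\tb}"                     # { } delimiters; \t remains two characters
dashed  <- r"---(ends with )" here)---"  # dashes allow )" to appear inside the body

nchar(brace)       # 4: 'a', backslash, 't', 'b'
writeLines(dashed) # shows the body exactly as typed
```

The dashed form is the one to reach for when the embedded text may itself contain a closing delimiter followed by a quote.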

As an example, one can use (on a system with a Bash shell):

file_raw_string <-
r"(#!/bin/bash
echo $@
for word in $@;
do
  echo "This is the word: '${word}'."
done
exit 0
)"

writeLines(file_raw_string, "print_words.sh")

system("bash print_words.sh Word/1 w@rd2 LongWord composite-word")

or even another R script:

file_raw_string <- r"(
x <- lapply(mtcars[,1:4], mean)
cat(
  paste(
    "Mean for column", names(x), "is", format(x,digits = 2),
    collapse = "\n"
  )
)
cat("\n")
cat(r"{ - This is a raw string where \n, "", '', /, \ are allowed.}")
)"

writeLines(file_raw_string, "print_means.R")

source("print_means.R")

#> Mean for column mpg is 20
#> Mean for column cyl is 6.2
#> Mean for column disp is 231
#> Mean for column hp is 147
#>  - This is a raw string where \n, "", '', /, \ are allowed.

Created on 2021-08-01 by the reprex package (v2.0.0)

Upvotes: 2

russellpierce

Reputation: 4711

To write multi-line strings without worrying about quotes (only backticks need care), you can use:

as.character(quote(`
all of the crazy " ' ) characters, except 
backtick and bare backslashes that aren't 
printable, e.g. \n works but a \ and c with no space between them would fail`))

Upvotes: 0

akraf

Reputation: 3255

Pages 90f. of An introduction to R state that it is possible to write R scripts like this (I quote the example modified from there):

chem <- scan()
2.90 3.10 3.40 3.40 3.70 3.70 2.80 2.50 2.40 2.40 2.70 2.20
5.28 3.37 3.03 3.03 28.95 3.77 3.40 2.20 3.50 3.60 3.70 3.70

print(chem)

Write these lines into a file, and give it the name, say, heredoc.R. If you then execute that script non-interactively by typing in your terminal

Rscript heredoc.R

you will get the following output

Read 24 items
 [1]  2.90  3.10  3.40  3.40  3.70  3.70  2.80  2.50  2.40  2.40  2.70  2.20
[13]  5.28  3.37  3.03  3.03 28.95  3.77  3.40  2.20  3.50  3.60  3.70  3.70

So the data provided in the file are saved in the variable chem. By default, scan() reads from the connection stdin(). In interactive mode (R started without a script), stdin() refers to user input from the console, but when a script is read in, the lines following the call are read from that script *). The empty line after the data is important because it marks the end of the data.

This also works with tabular data:

tab <- read.table(file=stdin(), header=TRUE)
A B C
1 1 0
2 1 0
3 2 9

summary(tab)
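The scan() and read.table() examples above depend on the script being fed to Rscript. As a hedged alternative (not taken from An Introduction to R), both functions also accept a text= argument, so the data can sit in a string and the script also works under source() or in an interactive session:

```r
# Alternative sketch: pass the data as a string via the text= argument
# instead of letting it follow the call in the script.
chem <- scan(text = "2.90 3.10 3.40 3.40
3.70 3.70 2.80 2.50", quiet = TRUE)

tab <- read.table(text = "A B C
1 1 0
2 1 0
3 2 9", header = TRUE)

print(chem)
summary(tab)
```

The trade-off is that quotes inside the data now need care, which the stdin() approach avoids.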

When using readLines(.), you must specify the number of lines read; the approach with the empty line does not work here:

txt <- readLines(con=stdin(), n=5)
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi ultricies diam
sed felis mattis, id commodo enim hendrerit. Suspendisse iaculis bibendum eros,
ut mattis eros interdum sit amet. Pellentesque condimentum eleifend blandit. Ut
commodo ligula quis varius faucibus. Aliquam accumsan tortor velit, et varius
sapien tristique ut. Sed accumsan, tellus non iaculis luctus, neque nunc

print(txt)

You can overcome this limitation by reading one line at a time until a line is empty or matches some other predefined string. Note, however, that you may run out of memory if you read a large (>100MB) file this way, because each time you append a string to your read-in data, all the data is copied to another place in memory. See the chapter "Growing objects" in The R inferno:

txt <- c()
repeat {
    x <- readLines(con = stdin(), n = 1)
    # stop at end of input or at an empty line; any other EOF string works too
    if (length(x) == 0 || x == "") break
    txt <- c(txt, x)
}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi ultricies diam
sed felis mattis, id commodo enim hendrerit. Suspendisse iaculis bibendum eros,
ut mattis eros interdum sit amet. Pellentesque condimentum eleifend blandit. Ut
commodo ligula quis varius faucibus. Aliquam accumsan tortor velit, et varius
sapien tristique ut. Sed accumsan, tellus non iaculis luctus, neque nunc

print(txt)
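If the repeated copying is a concern, one possible sketch is to fill a preallocated buffer and double it when it runs full. The helper read_until() and its capacity argument are hypothetical names for this illustration, and it is demonstrated on a textConnection rather than stdin() so that it can be tried anywhere:

```r
# Hypothetical helper: read lines from `con` until `sentinel` or end of input,
# storing them in a buffer that doubles in size instead of growing by one.
read_until <- function(con, sentinel = "", capacity = 1024L) {
  buf <- character(capacity)
  n <- 0L
  repeat {
    x <- readLines(con, n = 1L)
    if (length(x) == 0L || identical(x, sentinel)) break
    if (n == length(buf)) length(buf) <- max(1L, 2L * length(buf))  # double capacity
    n <- n + 1L
    buf[n] <- x
  }
  buf[seq_len(n)]  # drop the unused tail
}

con <- textConnection(c("first", "second", "", "ignored"))
txt <- read_until(con)
close(con)
print(txt)
```

Inside a script run with Rscript, the same helper could presumably be called as read_until(stdin()) with the data lines following the call.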

*) If you want to read from standard input in an R script, for example because you want to create a reusable script which can be called with any input data (Rscript reusablescript.R < input.txt or some-data-generating-command | Rscript reusablescript.R), use file("stdin") rather than stdin().
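A minimal sketch of such a reusable script body follows. The function name read_input() is hypothetical; taking the connection as an argument is just one way to keep the code testable without real piped input.

```r
# Hypothetical helper for a reusable script: read every line from a connection.
# The default file("stdin") refers to the process's standard input,
# e.g. data piped into Rscript.
read_input <- function(con = file("stdin")) {
  on.exit(close(con))  # release the connection when done
  readLines(con)
}

# In reusablescript.R one might then write:
#   lines <- read_input()
#   cat(length(lines), "lines read\n")

# Trying it out without a pipe, using an in-memory connection:
demo <- read_input(textConnection(c("alpha", "beta")))
print(demo)
```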

Upvotes: 1

hrbrmstr

Reputation: 78832

Multi-line strings are going to be as close as you get. It's definitely not the same (since you have to care about the quotes) but it does work pretty well for what you're trying to achieve (and you can do it with more than read.table):

here_lines <- 'line 1
line 2
line 3
'

readLines(textConnection(here_lines))

## [1] "line 1" "line 2" "line 3" ""


here_csv <- 'thing,val
one,1
two,2
'

read.table(text=here_csv, sep=",", header=TRUE, stringsAsFactors=FALSE)

##   thing val
## 1   one   1
## 2   two   2


here_json <- '{
"a" : [ 1, 2, 3 ],
"b" : [ 4, 5, 6 ],
"c" : { "d" : { "e" : [7, 8, 9]}}
}
'

jsonlite::fromJSON(here_json)

## $a
## [1] 1 2 3
## 
## $b
## [1] 4 5 6
## 
## $c
## $c$d
## $c$d$e
## [1] 7 8 9

here_xml <- '<CATALOG>
<PLANT>
<COMMON>Bloodroot</COMMON>
<BOTANICAL>Sanguinaria canadensis</BOTANICAL>
<ZONE>4</ZONE>a
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$2.44</PRICE>
<AVAILABILITY>031599</AVAILABILITY>
</PLANT>
<PLANT>
<COMMON>Columbine</COMMON>
<BOTANICAL>Aquilegia canadensis</BOTANICAL>
<ZONE>3</ZONE>
<LIGHT>Mostly Shady</LIGHT>
<PRICE>$9.37</PRICE>
<AVAILABILITY>030699</AVAILABILITY>
</PLANT>
</CATALOG>
'

str(xml <- XML::xmlParse(here_xml))

## Classes 'XMLInternalDocument', 'XMLAbstractDocument' <externalptr>

print(xml)

## <?xml version="1.0"?>
## <CATALOG>
##   <PLANT><COMMON>Bloodroot</COMMON><BOTANICAL>Sanguinaria canadensis</BOTANICAL><ZONE>4</ZONE>a
## <LIGHT>Mostly Shady</LIGHT><PRICE>$2.44</PRICE><AVAILABILITY>031599</AVAILABILITY></PLANT>
##   <PLANT>
##     <COMMON>Columbine</COMMON>
##     <BOTANICAL>Aquilegia canadensis</BOTANICAL>
##     <ZONE>3</ZONE>
##     <LIGHT>Mostly Shady</LIGHT>
##     <PRICE>$9.37</PRICE>
##     <AVAILABILITY>030699</AVAILABILITY>
##   </PLANT>
## </CATALOG>
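For the original readLines() use case, a hedged variation is to combine a multi-line string with the raw-string syntax available since R 4.0.0 and split it on newlines. This is a sketch with made-up sample content; strsplit() is used here instead of textConnection(), and the string ends without a final newline, so no trailing "" element appears.

```r
# Requires R >= 4.0.0 for the raw string; quotes and backslashes need no escaping.
data_text <- r"(first line with "double" and 'single' quotes
second line with a backslash: C:\path
third line)"

# Split into one element per line, like readLines() would return.
lines <- strsplit(data_text, "\n", fixed = TRUE)[[1]]
length(lines)  # 3
```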

Upvotes: 5
