bioinformatician

Reputation: 364

How to automate commands in R?

I have a very basic question.

I am a new R user. I am currently using an R package for my analysis, and I have to run a list of commands from that package to get the desired output. I want to turn this into an analysis pipeline and automate it, so that I can do the work with a single R command that takes the required parameters.

This is the kind of thing we do with shell scripts (where we chain a number of Linux commands and awk/sed/perl lines).

Please point me to some links on how to do this; I would be thankful.

Upvotes: 2

Views: 8522

Answers (3)

Rob Donnelly

Reputation: 2336

Another option is to run your program with Rscript. The command-line arguments can be accessed with args <- commandArgs(trailingOnly=TRUE) (they are returned as a character vector, so numeric values have to be converted).

For example, using the pipeline from mathematical.coffee's answer, your script would look like:


args <- commandArgs(trailingOnly=TRUE)
MU <- as.numeric(args[[1]])  # the mean
SD <- as.numeric(args[[2]])  # standard deviation
NUMBER_TO_GENERATE <- as.integer(args[[3]])

# store the sample so the following steps can use it
x <- rnorm(NUMBER_TO_GENERATE, mean=MU, sd=SD)
doOtherStuff(x)  # placeholder for the rest of the analysis

You could then run your script from the shell with Rscript myscript.R 2.0 0.1 100
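The script above assumes exactly three well-formed arguments. As a minimal sketch (not part of the original answer), a small validation step can make mistakes easier to diagnose. The helper name parse_positional_args is hypothetical, and a fixed vector stands in for commandArgs(trailingOnly=TRUE) so the example runs on its own.

```r
# Hypothetical helper: validate and convert three positional arguments.
parse_positional_args <- function(args) {
  if (length(args) != 3) {
    stop("usage: Rscript myscript.R <mean> <sd> <count>", call. = FALSE)
  }
  # Non-numeric strings become NA; warnings are silenced and checked below.
  mu <- suppressWarnings(as.numeric(args[1]))
  sd <- suppressWarnings(as.numeric(args[2]))
  n  <- suppressWarnings(as.integer(args[3]))
  if (anyNA(c(mu, sd, n))) {
    stop("all three arguments must be numeric", call. = FALSE)
  }
  list(mu = mu, sd = sd, n = n)
}

# In a real script this would be commandArgs(trailingOnly = TRUE).
example_args <- c("2.0", "0.1", "100")
params <- parse_positional_args(example_args)
x <- rnorm(params$n, mean = params$mu, sd = params$sd)
```

With this in place, calling the script with too few arguments stops with a usage message instead of failing later with a subscript error.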

If you want to do something fancier with the arguments (e.g. named options such as --filename), you can use the optparse package: http://www.r-bloggers.com/passing-arguments-to-an-r-script-from-command-lines/
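To illustrate what named options involve, here is a rough base-R sketch that parses arguments of the form --name=value. This is not optparse and not how optparse works internally; it is only an assumption-laden stand-in that handles one simple syntax, and the helper name parse_named_args is hypothetical. A dedicated package also takes care of help text, types, and defaults.

```r
# Hypothetical helper: turn c("--mean=2", "--sd=0.1") into a named list.
# Arguments that do not match --name=value are ignored.
parse_named_args <- function(args) {
  flags  <- grep("^--[^=]+=", args, value = TRUE)
  keys   <- sub("^--([^=]+)=.*$", "\\1", flags)
  values <- sub("^--[^=]+=", "", flags)
  as.list(setNames(values, keys))
}

# In a real script this would be commandArgs(trailingOnly = TRUE).
example_args <- c("--filename=input.csv", "--mean=2.0")
opts <- parse_named_args(example_args)

filename <- opts$filename
mu <- as.numeric(opts$mean)  # values arrive as strings and need converting
```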

Upvotes: 0

mathematical.coffee

Reputation: 56955

Suppose this was my analysis pipeline: I want to generate 10 numbers from the normal distribution with mean MU and standard deviation SD and then do something else with them:

MU <- 1  # the mean
SD <- .5 # standard deviation
NUMBER_TO_GENERATE <- 10

x <- rnorm(NUMBER_TO_GENERATE, mean=MU, sd=SD)
# ... more analysis here.

At the moment I copy-paste these commands into the R terminal. There are a few ways to "automate" this.

1. Write a function

I wrap my list of commands in one function and turn my parameters into function arguments:

myFunction <- function( MU, SD, NUMBER_TO_GENERATE ) {
    x <- rnorm(NUMBER_TO_GENERATE, mean=MU, sd=SD)
    # ... rest of analysis
}

Now within R I can just do myFunction(1, .5, 10), reducing the number of commands I have to type to 1.
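As a small extension of the function above (a sketch, not the only way to do it), default argument values and an explicit return value make the function easier to reuse. The summary returned here is just a placeholder for the "rest of analysis" step.

```r
# Same pipeline as above, with default values and an explicit return value.
myFunction <- function(MU = 1, SD = 0.5, NUMBER_TO_GENERATE = 10) {
  x <- rnorm(NUMBER_TO_GENERATE, mean = MU, sd = SD)
  # Placeholder for the rest of the analysis: summarise the sample.
  list(values = x, sample_mean = mean(x), sample_sd = sd(x))
}

set.seed(42)  # only to make the example reproducible
result <- myFunction()                                   # all defaults
other  <- myFunction(MU = 5, NUMBER_TO_GENERATE = 100)   # override by name
```

Naming the arguments in the call means you only have to spell out the parameters that differ from the defaults.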

2. Write a script

I could write a script file myScript.r. This is like a bash script except it's a list of R commands.

I can either put my original list of commands in there, or I could put my function in there plus an additional statement at the bottom: myFunction(1,.5,10).

Then from within R, I can do:

source('myScript.r')

and it will run all the R commands in the script.
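By default, source() evaluates the script in your global workspace, so every object the script creates ends up there. A minimal sketch of keeping those objects separate, assuming you prefer that, is to pass an environment via the local argument. The script is written to a temporary file here only so that the example is self-contained.

```r
# Write a small script to a temporary file so the example runs on its own.
script_path <- tempfile(fileext = ".r")
writeLines(c(
  "myFunction <- function(MU, SD, NUMBER_TO_GENERATE) {",
  "    rnorm(NUMBER_TO_GENERATE, mean = MU, sd = SD)",
  "}",
  "x <- myFunction(1, .5, 10)"
), script_path)

# Evaluate the script in its own environment instead of the global workspace.
script_env <- new.env()
source(script_path, local = script_env)

ls(script_env)         # names of the objects the script created
length(script_env$x)   # size of the generated sample
```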

3. From the shell

If you want to run this script from the shell, I'd suggest having a file myScript.r with the function inside it.

Then check out Rscript (you can just type ?Rscript from within R). It comes installed with R by default, and you use it for executing R commands from a Unix/Windows command line.

For example:

[mathematical.coffee@bar ~]$ Rscript -e '1+1'
[1] 2

In particular, you could combine methods 1) and 2) with Rscript to do something like:

[mathematical.coffee@bar ~]$ Rscript -e 'source("myScript.r"); myFunction( 1, .5, 10 )'

to run your function.

Or you could of course just include the call myFunction(1, .5, 10) in your myScript.r, in which case you can just do Rscript myScript.r.

The advantage of the former shows up if you want to do shell scripting, because the parameters can then come from shell variables (I only mention this because you mentioned shell scripts in your question). In a bash script we could do something like:

#!/bin/bash
MU=1;
SD=.5;
NUM=10;

Rscript -e "source('myScript.r'); myFunction($MU,$SD,$NUM)"

However, I'd argue against mixing bash scripts with R scripts. As I said before, I only mention this option because you mentioned shell scripts in your question.
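One way to stay entirely in R, sketched here under the assumption that the goal of the shell wrapper is to run the pipeline for several parameter sets, is to build the parameter combinations in R and call the function once per combination. The grid values and the column name result are made up for illustration.

```r
myFunction <- function(MU, SD, NUMBER_TO_GENERATE) {
  x <- rnorm(NUMBER_TO_GENERATE, mean = MU, sd = SD)
  mean(x)  # placeholder for the rest of the analysis
}

# Hypothetical parameter grid: every combination of mean and sd.
params <- expand.grid(MU = c(0, 1), SD = c(0.5, 2), NUM = 10)

# Run the pipeline once per row and keep each result next to its parameters.
params$result <- mapply(myFunction, params$MU, params$SD, params$NUM)
```

The resulting data frame has one row per run, which tends to be easier to inspect and save than output collected from separate Rscript calls.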

Upvotes: 14

johannes

Reputation: 14453

Functions are probably what you are looking for:

foo <- function() {
  data <- data.frame(a=1:10, b=10:1)
  plot(data)

  # many more commands here
}

Then you can just call foo() and all the commands are run.

See the R help for more in-depth information.

Also source() might be of interest to you, see ?source.

Upvotes: 2
