tef2128

Reputation: 780

data frame in user defined function in R

I'm trying to make a function that takes two arguments. One argument is the name of a data frame, and the second is the name of a column in that data frame. The goal is for the function to manipulate data in the whole frame based on information contained in the specified column.

My problem is that I can't figure out how to use the character string passed as the second argument to access that particular column of the data frame within the function. Here's a very brief example:

datFunc <- function(dataFrame = NULL, charExpres = NULL) {

return(dataFrame$charExpres)

}

If, for instance, you enter

datFunc(myData, "variable1")

this does not return myData$variable1. There has to be a simple way to do this. Sorry if the question is stupid, but I'd appreciate a little help here.

A related question: how do I use the character string "myData$variable1" to actually return variable1 from myData?

Upvotes: 4

Views: 8016

Answers (3)

CHP

Reputation: 17189

I think the OP wants to pass the name of the data frame as a string too. If that is the case, your function should look something like this (sample data borrowed from the other answer):

fooFunc <- function( dfNameStr, colNamestr, drop=TRUE) {
  df <- get(dfNameStr)
  return(df[,colNamestr, drop=drop])
}


> myData <- data.frame(ID=1:10, variable1=rnorm(10, 10, 1))
> myData
   ID variable1
1   1 10.838590
2   2  9.596791
3   3 10.158037
4   4  9.816136
5   5 10.388900
6   6 10.873294
7   7  9.178112
8   8 10.828505
9   9  9.113271
10 10 10.345151


> fooFunc('myData', 'ID', drop=FALSE)
   ID
1   1
2   2
3   3
4   4
5   5
6   6
7   7
8   8
9   9
10 10
> fooFunc('myData', 'ID', drop=TRUE)
 [1]  1  2  3  4  5  6  7  8  9 10
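
For the related question of turning the string "myData$variable1" into the column itself, here is a minimal sketch of two possible approaches. It assumes myData exists in the calling environment; the helper name colFromString and the sample values are made up for illustration and are not from the question.

```r
myData <- data.frame(ID = 1:10, variable1 = seq(0.5, 5, by = 0.5))

# Option 1: parse the string as R code and evaluate it.
# Only reasonable for trusted input, because whatever the string contains is run.
viaParse <- eval(parse(text = "myData$variable1"))

# Option 2 (hypothetical helper): split on "$" and look up each piece by name.
colFromString <- function(expr, envir = parent.frame()) {
  parts <- strsplit(expr, "$", fixed = TRUE)[[1]]
  stopifnot(length(parts) == 2)
  get(parts[1], envir = envir)[[parts[2]]]
}
viaSplit <- colFromString("myData$variable1")

identical(viaParse, viaSplit)  # TRUE
```

The second option avoids evaluating arbitrary code, at the cost of only handling the simple "frame$column" form.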

Upvotes: 3

RJ-

Reputation: 3037

Alternatively, you can find the column index of the dataframe:

df <- as.data.frame(matrix(rnorm(100), ncol = 10))
colnames(df) <- sample(LETTERS, 10)

column.index.of.A <- grep("^A$", colnames(df))
df[, column.index.of.A]
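
Note that sample(LETTERS, 10) may not pick "A" at all, in which case grep() returns integer(0). As a rough sketch of an exact-name lookup that does not need regex anchors, match() can be used instead; the fixed column names and values below are assumptions chosen so the result is reproducible.

```r
df <- as.data.frame(matrix(1:20, ncol = 4))
colnames(df) <- c("B", "A", "AA", "C")

# match() compares whole names, so "AA" is not confused with "A";
# it returns NA when the name is absent.
idx <- match("A", colnames(df))
idx        # 2
df[, idx]  # the column named "A": 6 7 8 9 10

match("Z", colnames(df))  # NA
```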

Upvotes: 0

Jilber Urbina

Reputation: 61214

You're almost there. Use [ instead of $ for this kind of indexing: $ takes the name after it literally, whereas [ evaluates charExpres and uses the string it contains.

datFunc <- function(dataFrame = NULL, charExpres = NULL, drop = TRUE) {
  return(dataFrame[, charExpres, drop = drop])
}


# An example
set.seed(1)
myData <- data.frame(ID=1:10, variable1=rnorm(10, 10, 1))  # DataFrame

datFunc(myData, "variable1") # dropping dimensions
[1]  9.373546 10.183643  9.164371 11.595281 10.329508  9.179532 10.487429 10.738325 10.575781  9.694612

datFunc(myData, "variable1", drop=FALSE) # keeping dimensions
   variable1
1   9.373546
2  10.183643
3   9.164371
4  11.595281
5  10.329508
6   9.179532
7  10.487429
8  10.738325
9  10.575781
10  9.694612
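
If a plain vector is all that is needed, [[ is another option. The following is only a sketch: getColumn is a hypothetical name, and the input checks are an addition for illustration rather than part of the function above.

```r
# Hypothetical helper: `[[` evaluates colName, whereas `$` would treat
# the text after it as a literal column name.
getColumn <- function(dataFrame, colName) {
  stopifnot(is.data.frame(dataFrame), is.character(colName), length(colName) == 1)
  if (!colName %in% names(dataFrame)) {
    stop("no column named '", colName, "' in the data frame")
  }
  dataFrame[[colName]]
}

set.seed(1)
myData <- data.frame(ID = 1:10, variable1 = rnorm(10, 10, 1))

identical(getColumn(myData, "variable1"), myData$variable1)  # TRUE
is.null(myData$colName)  # TRUE: `$` looks for a column literally named "colName"
```

Failing loudly on an unknown name is a design choice here; without the check, [[ on a data frame would quietly return NULL.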

Upvotes: 2
