Reputation: 842
I'm new to R. I have this function:
Myfunction <- function (Amean, Asd, An, Bmean, Bsd, Bn, NameOfMYdf)
{
NameOfMYdf$Difference <- (Amean - Bmean)
NameOfMYdf$SDP <- sqrt((((An - 1) * Asd^2 + (Bn - 1) * Bsd^2) / (An + Bn - 2)))
}
When I call this function, I'd like to pass in the variable names from my dataset.
With this function I want to create 2 new variables in the same dataset:
NameOfMYdf$Difference
NameOfMYdf$SDP
I think it's very easy to do, but I cannot figure it out.
Thanks, folks.
Is there no way to pass in my variable names and compute the result? I want to call:
Myfunction(meanGroupA, sdGroupA, nGroupA, meanGroupB, sdGroupB, nGroupB, NameOfMyDataset)
Basically, I want to pass the dataset name to the function.
Thanks.
Upvotes: 0
Views: 66
Reputation: 146040
Your question approaches R from a typical object-oriented perspective, where a function or method modifies an object in place. (It looks like you want Myfunction to add columns to whatever data.frame you give it.)
R is a functional programming language, which means functions generally don't modify their arguments in place; they return new values instead. There are ways to make in-place modification happen, but they're difficult to use well and are generally considered bad practice.
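A minimal sketch of what that means in practice (the names add_flag and df are made up for illustration): assigning a column inside a function changes only the function's local copy of the data.frame.

```r
# Hypothetical example: the function modifies its argument, but only locally.
add_flag <- function(d) {
  d$flag <- TRUE    # changes the local copy `d`, not the caller's object
  invisible(NULL)   # nothing useful is returned
}

df <- data.frame(a = 1:3)
add_flag(df)
names(df)           # still only "a": the caller's data.frame is unchanged
```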
Let's do a quick example in an R-like way:
# sample data
mydata <- data.frame(a = rnorm(10), b = runif(10))
Then let's say there's a calculation on two columns that you want to do a lot:
common_task <- function(x, y) {
((x - 1) * y + (y - 1) * x) / (x + y - 2)
}
The easiest/most common way to add this to your data.frame is
mydata$calc <- common_task(x = mydata$a, y = mydata$b)
If you want to use variable names, then strings work well. If your task will always be performed on a data.frame with columns named a and b, then you can write a function that assumes the data.frame has those column names:
common_task2 <- function(data) {
((data$a - 1) * data$b + (data$b - 1) * data$a) /
(data$a + data$b - 2)
}
A better way is to let the column names be passed in as strings. For this the $ subset shortcut won't work; we need to use [ instead.
common_task3 <- function(data, x = "a", y = "b") {
((data[, x] - 1) * data[, y] + (data[, y] - 1) * data[, x]) / (data[, x] + data[, y] - 2)
}
This last function will assume the column names you want to work on are "a" and "b", unless you tell it otherwise.
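To illustrate why $ does not work with a column name stored in a variable, here is a small sketch. The data.frame and the variable col are invented for this example; [[ is shown as an alternative to [ that also extracts a single column by name.

```r
# Made-up data for illustration
df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6))
col <- "b"      # column name stored as a string

df$col          # NULL: `$` looks for a column literally named "col"
df[, col]       # the values of column "b": `[` evaluates `col` first
df[[col]]       # same values; `[[` always extracts exactly one column
```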
However, in all three cases, the function just returns a new column. To get it in your data.frame outside of the function, you need to assign it, i.e.,
mydata$new_col3 <- common_task3(data = mydata)
mydata$new_col2 <- common_task2(data = mydata)
You could assign the columns inside the function, but you'll still need to return the data.frame and assign the result. The function won't modify the data.frame outside of itself:
common_task4 <- function(data, x = "a", y = "b") {
data$result <- ((data[, x] - 1) * data[, y] + (data[, y] - 1) * data[, x]) /
(data[, x] + data[, y] - 2)
return(data)
}
my_modified_data <- common_task4(data = mydata)
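Applied to the function from the question, the same pattern might look like the sketch below. The function name add_pooled_sd, its argument names, and the example data are all assumptions made for illustration; only the two formulas come from the question.

```r
# Sketch: column names are passed as strings and the modified
# data.frame is returned, so the caller must assign the result.
add_pooled_sd <- function(data, a_mean, a_sd, a_n, b_mean, b_sd, b_n) {
  data$Difference <- data[[a_mean]] - data[[b_mean]]
  data$SDP <- sqrt(((data[[a_n]] - 1) * data[[a_sd]]^2 +
                    (data[[b_n]] - 1) * data[[b_sd]]^2) /
                   (data[[a_n]] + data[[b_n]] - 2))
  data
}

# Made-up data, for illustration only
studies <- data.frame(
  meanGroupA = c(10, 12), sdGroupA = c(2, 3), nGroupA = c(20, 25),
  meanGroupB = c(8, 11),  sdGroupB = c(2, 3), nGroupB = c(20, 30)
)

studies <- add_pooled_sd(studies, "meanGroupA", "sdGroupA", "nGroupA",
                         "meanGroupB", "sdGroupB", "nGroupB")
```

Because the result is assigned back to studies, the two new columns end up in the same dataset, which is what the question asks for.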
In all of these cases, there are existing functions that can do this for you. @Jilber's answer recommends transform, which is a good one. The dplyr package is also very nice and easy to use. You can write your own versions, but the existing ones will usually be faster and more robust.
For lots more detail and examples, see Advanced R Programming: Functions.
Upvotes: 1
Reputation: 61214
I'd use transform to do what you want. Take a look at this simple example:
> df <- head(mtcars)[, 1:3]
> transform(df, # a data.frame to be transformed
Difference=mpg-cyl, # first transformation
DSP=disp^2) # replace with sqrt((((An - 1) * Asd^2 + (Bn - 1) * Bsd^2) / (An + Bn - 2)))
mpg cyl disp Difference DSP
Mazda RX4 21.0 6 160 15.0 25600
Mazda RX4 Wag 21.0 6 160 15.0 25600
Datsun 710 22.8 4 108 18.8 11664
Hornet 4 Drive 21.4 6 258 15.4 66564
Hornet Sportabout 18.7 8 360 10.7 129600
Valiant 18.1 6 225 12.1 50625
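With the formula from the question substituted in, the call might look like the following sketch. The data.frame trials and its column names are invented here; they stand in for whatever the real dataset uses.

```r
# Made-up data, for illustration only
trials <- data.frame(
  meanA = c(10, 12), sdA = c(2, 3), nA = c(20, 25),
  meanB = c(8, 11),  sdB = c(2, 3), nB = c(20, 30)
)

# transform() returns a new data.frame, so assign it back to keep the columns
trials <- transform(
  trials,
  Difference = meanA - meanB,
  SDP = sqrt(((nA - 1) * sdA^2 + (nB - 1) * sdB^2) / (nA + nB - 2))
)
```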
Upvotes: 1