user1976836

Reputation: 21

Create a new variable within several data frames in R

I have several data frames: df1, df2, ..., df10. The columns (variables) are the same in all of them.

I want to create a new variable within each of them. I can easily do it "manually" as follows:

df1$newvariable <- ifelse(df1$oldvariable == 999, NA, df1$oldvariable)

or, alternatively

df1 <- transform(df1, newvariable = ifelse(oldvariable == 999, NA, oldvariable))

Unfortunately I'm not able to do this in a loop. If I write

for (i in names) { #names is the list of dataframes
  i$newvariable <- ifelse(i$oldvariable == 999, NA, i$oldvariable)
}

I get the following output

Error in i$oldvariable : $ operator is invalid for atomic vectors

Upvotes: 2

Views: 1192

Answers (2)

IRTFM

Reputation: 263481

This has been asked many times. The $<- function cannot translate the loop index "i" into either its first or its second argument. The [[<- function can do so for the second argument but not the first. You should learn to use lapply, and you will probably need two nested lapply calls: one over the list of "names" and one over the columns of each data frame. The question is incomplete because it lacks specific examples. Make up a set of three data frames, set some of the values to 999, and provide a list of names.
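As a rough sketch of the [[<- point above: if the data frames are kept in a named list, the loop variable can be used as the index inside [[ ]], because that argument is evaluated. The data below is made up for illustration, and the list name dfs is an assumption, not something from the question.

```r
# Hypothetical example data: three small data frames sharing a column,
# with some values set to the sentinel 999.
dfs <- list(
  df1 = data.frame(oldvariable = c(1, 999, 3)),
  df2 = data.frame(oldvariable = c(999, 5, 6)),
  df3 = data.frame(oldvariable = c(7, 8, 999))
)

# nm is a character string here. nm$oldvariable would fail on it,
# but dfs[[nm]] evaluates nm and looks up the data frame of that name.
for (nm in names(dfs)) {
  old <- dfs[[nm]][["oldvariable"]]
  dfs[[nm]][["newvariable"]] <- ifelse(old == 999, NA, old)
}

dfs[["df1"]]
```

The original loop fails because i holds a name (an atomic character vector), not the data frame itself, which is what the error message is reporting.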

Upvotes: 1

Arun

Reputation: 118889

What I'd do is put all the data.frames into a list and then use lapply as follows:

df1 <- as.data.frame(matrix(runif(2*10), ncol=2))
df2 <- as.data.frame(matrix(runif(2*10), ncol=2))
df3 <- as.data.frame(matrix(runif(2*10), ncol=2))
df4 <- as.data.frame(matrix(runif(2*10), ncol=2))

# create a list and use lapply
df.list <- list(df1, df2, df3, df4)
out <- lapply(df.list, function(x) {
    x$id <- 1:nrow(x)
    x
})

Now all the data.frames have a new column id appended, and out is a list of data.frames. You can access each data.frame with out[[1]], out[[2]], etc.
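The same pattern could be applied to the recoding in the question. This is only a sketch: the two small data frames are invented for illustration, and the column names oldvariable and newvariable are taken from the question.

```r
# Hypothetical stand-ins for the asker's data frames.
df1 <- data.frame(oldvariable = c(1, 999, 3))
df2 <- data.frame(oldvariable = c(999, 5, 6))

# Naming the list elements keeps track of which result is which.
df.list <- list(df1 = df1, df2 = df2)

# Recode the sentinel 999 as NA in a new column of every data frame.
out <- lapply(df.list, function(x) {
    x$newvariable <- ifelse(x$oldvariable == 999, NA, x$oldvariable)
    x  # return the modified copy; the originals are left unchanged
})

out$df1
```

Because lapply works on copies, df1 and df2 themselves are not modified. If the updated versions are needed under their original names, something like list2env(out, envir = .GlobalEnv) should copy them back, although keeping them in the list is usually easier to work with.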

Upvotes: 4
