Reputation: 1
I am new to R, so please bear with me.
I have two dataframes:
df1 <- data.frame(name = c("name 1", "name 2", "name 3", "name 4"),
                  columnname = c("hello", "", "hello", ""))
df2 <- data.frame(name = c("name 1", "name 2", "name 3"),
                  columnname = c(1, 2, 3))
They look like this:
df1
#name   columnname
#name 1 hello
#name 2
#name 3 hello
#name 4
df2
#name   columnname
#name 1 1
#name 2 2
#name 3 3
My goal is to replace the value "hello" in df1 with the corresponding value in df2 (and NA otherwise), and create a new dataframe, df3. So far I have the following code:
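fun <- function(cat_df, ret_df, col_name) {
  # Align ret_df to cat_df by name; names missing from ret_df give NA
  idx <- match(cat_df$name, ret_df$name)
  out <- data.frame(name = cat_df$name)
  # Use a real NA here, not the string "NA"
  out[, col_name] <- ifelse(cat_df[, col_name] == "hello", ret_df[idx, col_name], NA)
  return(out)
}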
df3 <- fun(df1, df2, col_name = "columnname")
df3
#name   columnname
#name 1 1
#name 2 NA
#name 3 3
#name 4 NA
However, my real data has 350 columns and 3000 rows. So my question is: how can I extend this code to handle data frames of that size? Other approaches are very welcome!
Upvotes: 0
Views: 60
Reputation: 1
So if my two data frames have these dimensions:
dim(df1)
# 639 260
dim(df2)
# 2273 260
Would the code then look like:
set.seed(4)
nobs=2273
df1 <- data.frame(name=paste("name",1:nobs))
df1[,paste0("col",1:260)] <- sample(c("hello",""),260*nobs,replace=TRUE)
df2 <- data.frame(name=paste("name",1:nobs))
df2[,paste0("col",1:260)] <- 1:nobs
mycols <- colnames(df1)[-1]
names(mycols) <- mycols
df3 <- data.frame(name=df1$name)
df3[mycols]<- lapply(mycols,function(x){
ifelse(df1[,x]=="hello",df2[,x],NA)
})
df3
?
Upvotes: 0
Reputation: 13149
Because you want a solution for multiple columns, we first create some example data with several columns (you could have done this yourself...):
set.seed(4)
nobs=5
df1 <- data.frame(name=paste("name",1:nobs))
df1[,paste0("col",1:5)] <- sample(c("hello",""),5*nobs,replace=TRUE)
#     name  col1  col2  col3  col4  col5
# 1 name 1       hello       hello
# 2 name 2 hello       hello
# 3 name 3 hello       hello
# 4 name 4 hello                   hello
# 5 name 5       hello hello
df2 <- data.frame(name=paste("name",1:nobs))
df2[,paste0("col",1:5)] <- 1:nobs
#     name col1 col2 col3 col4 col5
# 1 name 1    1    1    1    1    1
# 2 name 2    2    2    2    2    2
# 3 name 3    3    3    3    3    3
# 4 name 4    4    4    4    4    4
# 5 name 5    5    5    5    5    5
Then we create a named vector of the columns to process (everything except name):
mycols <- colnames(df1)[-1]
names(mycols) <- mycols
And build the result, keeping the value from df2 wherever df1 has "hello" and NA everywhere else:
df3 <- data.frame(name=df1$name)
df3[mycols] <- lapply(mycols,function(x){
  ifelse(df1[,x]=="hello",df2[,x],NA)
})
#     name col1 col2 col3 col4 col5
# 1 name 1   NA    1   NA    1   NA
# 2 name 2    2   NA    2   NA   NA
# 3 name 3    3   NA    3   NA   NA
# 4 name 4    4   NA   NA   NA    4
# 5 name 5   NA    5    5   NA   NA
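The code above compares df1 and df2 row by row, so it assumes both have the same rows in the same order. If the row counts differ (as in the original question, and in the 639 vs. 2273 case from the follow-up), one option is to line the rows up by name first with match(). This is only a sketch: the small data frames below are made up for illustration, and it assumes the key column is called name in both data frames and that names are unique in df2.

```r
# Hypothetical inputs with different row counts and row order (illustration only)
df1 <- data.frame(name = c("name 1", "name 2", "name 3", "name 4"),
                  col1 = c("hello", "", "hello", ""),
                  col2 = c("", "hello", "hello", "hello"))
df2 <- data.frame(name = c("name 3", "name 1", "name 2"),
                  col1 = c(3, 1, 2),
                  col2 = c(30, 10, 20))

# Value columns present in both data frames, excluding the key column
mycols <- setdiff(intersect(colnames(df1), colnames(df2)), "name")

# For each row of df1, the position of the same name in df2 (NA if absent)
idx <- match(df1$name, df2$name)

df3 <- data.frame(name = df1$name)
df3[mycols] <- lapply(mycols, function(x) {
  # df2[[x]][idx] is df2's column rearranged into df1's row order
  ifelse(df1[[x]] == "hello", df2[[x]][idx], NA)
})
df3
```

With this, df3 has one row per row of df1. A name that does not occur in df2 (name 4 here) gets NA in every column, even where df1 says "hello", because there is no value to copy.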
Upvotes: 1