PatraoPedro
PatraoPedro

Reputation: 197

Return value using two conditions

I’m trying to return a value from a column in a dataframe based on two columns. So here is an example:

 year<-c(rep("2010",3),rep("2011",3),rep("2012",3))
 var1<-c(rep(c("aaa","bbb","ccc"),3))
 var2<-c(0.5,0.45,0.1,0.3,0.23,0.58,0.8,0.75,0.44)

 df1<-data.frame(year,var1,var2)


 ID<-c("z1","z2","z3","z4","z5")
 year<-c(2011,2011,2012,2010,2010)
 var3<-c("aaa","aaa","ccc","bbb","ccc")

 df2<-data.frame(ID,year,var3)

If I had only one condition I would do something like this:

df2$rate<-df1$var2[match(df2$year,df1$year)]

The problem is that I have two and I've tried a few thing such as:

df2$rate<-df1$var2[match(df2$year,df1$year)& match(df2$var3,df1$var1)]

But unfortunatly non of my attempts worked. Is there any simple way of returning the value based on two conditions?

For this particular case I could do merge the variable and year into one column and apply the previous method but I was looking for a more elegant solution.

 df1$join<-paste(df1$var1,df1$year,sep="")
 df2$join2<-paste(df2$var3,df2$year,sep="")

 df2$rate<-df1$var2[match(df2$join2,df1$join)]

Thanks in advance for any sugestion

Upvotes: 1

Views: 135

Answers (1)

akrun
akrun

Reputation: 887501

You could try merge

 merge(df2, df1, by.x=c("year", "var3"), by.y=c("year", "var1"), sort=FALSE)
 #  year var3 ID var2
 #1 2011  aaa z1 0.30
 #2 2011  aaa z2 0.30
 #3 2012  ccc z3 0.44
 #4 2010  bbb z4 0.45
 #5 2010  ccc z5 0.10

Or you could use left_join from dplyr

 library(dplyr)
 colnames(df2)[3] <- "var1"

The year column is factor. Change it to character class or numeric. You could have used stringsAsFactors=FALSE in the data.frame(..)

   df1$year <- as.character(df1$year)
   df2$year <- as.character(df2$year)
   left_join(df2,df1)
   #Joining by: c("year", "var1")
   #ID year var1 var2
  #1 z1 2011  aaa 0.30
  #2 z2 2011  aaa 0.30
  #3 z3 2012  ccc 0.44
  #4 z4 2010  bbb 0.45
  #5 z5 2010  ccc 0.10

Upvotes: 1

Related Questions