Conditional if on a dataframe based on another one

Question

I have a data frame like this

ID      VAR
a       3
b       2
c       6
b       8
z       1
c       5
b       7

and another one that counts the number of times each ID appears

numb    nrec
a       1
b       3
c       2
z       1

What I would like to do is replace the ID of every ID that has just one record with -1, e.g.

ID      VAR
-1      3
b       2
c       6
b       8
-1      1
c       5
b       7

Jilber Urbina · Accepted Answer

Here's an ugly solution

> ind <- as.character(df2$numb[df2$nrec==1])
> df1$ID <- as.character(df1$ID)
> df1$ID[as.character(df1$ID) %in% ind] <- "-1"
> df1
  ID VAR
1 -1   3
2  b   2
3  c   6
4  b   8
5 -1   1
6  c   5
7  b   7

If you want ID to be a factor again, use df1$ID <- as.factor(df1$ID)
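For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same idea. It rebuilds the two data frames from the question as character columns (the `stringsAsFactors = FALSE` choice is my assumption, the question does not say how the data were read in), so no `as.character()` conversion is needed.

```r
# Sample data from the question; IDs are kept as character, not factor
df1 <- data.frame(ID  = c("a", "b", "c", "b", "z", "c", "b"),
                  VAR = c(3, 2, 6, 8, 1, 5, 7),
                  stringsAsFactors = FALSE)
df2 <- data.frame(numb = c("a", "b", "c", "z"),
                  nrec = c(1, 3, 2, 1),
                  stringsAsFactors = FALSE)

# IDs that df2 reports as occurring exactly once
ind <- df2$numb[df2$nrec == 1]

# Overwrite those IDs in df1 with the string "-1"
df1$ID[df1$ID %in% ind] <- "-1"
df1
```

Note that the replacement is the string "-1", not the number -1, because the ID column holds text.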

A better way is to use revalue from the plyr package:

library(plyr)
df1$ID <- with(df1, revalue(ID, c("a"="-1", "z"="-1")))

EDIT: a cleaner way using base functions (this assumes df1$ID is still a factor)

ind <- as.character(df2$numb[df2$nrec==1])
levels(df1$ID)[levels(df1$ID) %in% ind] <- "-1"
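One thing to watch when more than one ID has a single record: `==` compares element by element and recycles the shorter vector, whereas `%in%` tests membership. The sketch below uses a made-up factor to show the difference (the names `id`, `ind`, `by_equal` and `by_in` are illustrative only, they are not from the question).

```r
# Hypothetical IDs: "b" and "c" occur once, "a" and "d" occur twice
id  <- factor(c("a", "a", "b", "c", "d", "d"))
ind <- c("b", "c")                 # the IDs with a single record

# `==` recycles ind to c("b", "c", "b", "c") and compares position by position
by_equal <- levels(id) == ind
# `%in%` asks, for each level, whether it appears anywhere in ind
by_in <- levels(id) %in% ind

by_equal
by_in

# Renaming via the membership test; identical new level names are merged
levels(id)[by_in] <- "-1"
as.character(id)
```

With the question's data the two comparisons happen to agree, because the single-record IDs fall on the positions that recycling lines up, so the problem only shows with other inputs.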

You can even do it directly using only df1, no need to use df2. Using table and some indexing...

levels(df1$ID)[levels(df1$ID) %in% with(df1, levels(ID)[table(ID)==1])] <- "-1"
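Spelled out step by step, the df1-only approach might look like the sketch below. It assumes ID is a factor, and the intermediate names `counts` and `singles` are mine, added for readability.

```r
# Sample data from the question, with ID stored as a factor
df1 <- data.frame(ID  = factor(c("a", "b", "c", "b", "z", "c", "b")),
                  VAR = c(3, 2, 6, 8, 1, 5, 7))

# Count how many rows each ID has
counts <- table(df1$ID)

# Names of the IDs that appear exactly once
singles <- names(counts)[counts == 1]

# Rename those levels; levels that end up with the same name are merged
levels(df1$ID)[levels(df1$ID) %in% singles] <- "-1"
df1
```

Because only the levels are renamed, the rows themselves are never touched and ID stays a factor.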
