Reputation: 47
I have a data frame like this
ID VAR
a 3
b 2
c 6
b 8
z 1
c 5
b 7
and another one that counts how many times each ID appears
numb nrec
a 1
b 3
c 2
z 1
What I would like to do is change the ID name of all the IDs that have just one record, e.g.
ID VAR
-1 3
b 2
c 6
b 8
-1 1
c 5
b 7
Upvotes: 0
Views: 57
Reputation: 61214
Here's an ugly solution
> ind <- as.character(df2$numb[df2$nrec==1])
> df1$ID <- as.character(df1$ID)
> df1$ID[as.character(df1$ID) %in% ind] <- "-1"
> df1
ID VAR
1 -1 3
2 b 2
3 c 6
4 b 8
5 -1 1
6 c 5
7 b 7
If you want ID to be a factor again, then use df1$ID <- as.factor(df1$ID)
A better way is to use revalue from the plyr package:
library(plyr)
df1$ID <- with(df1, revalue(ID, c("a"="-1", "z"="-1")))
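EDIT: a cleaner way using base functions (this assumes df1$ID is still a factor; %in% is used rather than ==, which would recycle ind and only match by coincidence)
ind <- as.character(df2$numb[df2$nrec==1])
levels(df1$ID)[levels(df1$ID) %in% ind] <- "-1"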
You can even do it directly using only df1, with no need for df2, by using table and some indexing:
levels(df1$ID)[levels(df1$ID) %in% with(df1, levels(ID)[table(ID)==1])] <- "-1"
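For anyone who wants to try this end to end, here is a minimal self-contained sketch of the df1-only approach. The data frame is rebuilt from the example in the question, and storing ID as a factor is an assumption on my part; the helper names counts and singles are just for illustration.

```r
# Rebuild the example data from the question.
# Assumption: ID is stored as a factor.
df1 <- data.frame(
  ID  = factor(c("a", "b", "c", "b", "z", "c", "b")),
  VAR = c(3, 2, 6, 8, 1, 5, 7)
)

# Count how often each ID occurs and keep the ones seen exactly once.
counts  <- table(df1$ID)
singles <- names(counts)[counts == 1]

# Rename those levels. %in% works for any number of matching IDs,
# and the duplicated "-1" levels are merged into one.
levels(df1$ID)[levels(df1$ID) %in% singles] <- "-1"

df1
```

Because only the factor levels are touched, the rows themselves are never rewritten, and ID stays a factor without any conversion to character and back.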
Upvotes: 1