Reputation: 2984
I have a data.frame in R, like this:
fruits
X1 X2 X3
aa kiwi 15
ba orange 25
cc lemon 23
ba apple 17
cc lemon 19
cc orange 18
cc orange 21
ba banana 17
I'd like to replace all values in column X2 except "orange" and "lemon" with "other". How can I do this in R?
Example data:
fruits <- structure(list(X1 = structure(c(1L, 2L, 3L, 2L, 3L, 3L, 3L, 2L
), .Label = c("aa", "ba", "cc"), class = "factor"), X2 = structure(c(3L,
5L, 4L, 1L, 4L, 5L, 5L, 2L), .Label = c("apple", "banana", "kiwi",
"lemon", "orange"), class = "factor"), X3 = c(15L, 25L, 23L,
17L, 19L, 18L, 21L, 17L)), .Names = c("X1", "X2", "X3"), class = "data.frame", row.names = c(NA,
-8L))
Upvotes: 2
Views: 9528
Reputation: 174898
An easy way is to coerce the factor to a character vector, identify which elements are not in the required classes, replace them with "other", and finally coerce back to a factor.
There are two variations on this theme, the first using the replace() function:
transform(fruits,
X2 = factor(replace(as.character(X2),
list = !X2 %in% c("orange","lemon"),
values = "other")))
which gives:
> transform(fruits, X2 = factor(replace(as.character(X2),
+ list = !X2 %in% c("orange","lemon"),
+ values = "other")))
X1 X2 X3
1 aa other 15
2 ba orange 25
3 cc lemon 23
4 ba other 17
5 cc lemon 19
6 cc orange 18
7 cc orange 21
8 ba other 17
Or you can do it by hand:
fruits <- transform(fruits,
X2 = {x <- as.character(X2)
x[!x %in% c("orange","lemon")] <- "other"
factor(x)})
> fruits
X1 X2 X3
1 aa other 15
2 ba orange 25
3 cc lemon 23
4 ba other 17
5 cc lemon 19
6 cc orange 18
7 cc orange 21
8 ba other 17
I use transform() here so that the manipulation happens inside an environment where X2 is visible, without having to use things like fruits$X2, which gets tedious to type out.
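For comparison, here is a minimal sketch of the same three steps written with explicit fruits$X2 references instead of transform(). The data frame is a cut-down stand-in for the question's data, and the name keep is only illustrative.

```r
# Cut-down stand-in for the question's data; X2 is a factor, as in the original.
fruits <- data.frame(
  X1 = c("aa", "ba", "cc", "ba"),
  X2 = factor(c("kiwi", "orange", "lemon", "apple")),
  X3 = c(15L, 25L, 23L, 17L)
)

keep <- c("orange", "lemon")   # illustrative name: values to leave untouched

# The same steps as above, spelling out fruits$X2 each time.
x <- as.character(fruits$X2)   # factor -> character
x[!x %in% keep] <- "other"     # replace everything that is not in `keep`
fruits$X2 <- factor(x)         # character -> factor

fruits
```

The result is the same either way; transform() just saves repeating the data frame name.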
Upvotes: 2
Reputation: 60492
What about:
R> fruits = data.frame(X1 = 1:3, X2 = c("kiwi", "orange", "lemon"))
R> fruits$X2 = as.character(fruits$X2)
R> fruits[!(fruits$X2 %in% c("lemon", "orange")),]$X2 = "Other"
R> fruits
X1 X2
1 1 Other
2 2 orange
3 3 lemon
In the above solution, I converted the factor to a character vector. You don't have to do this; you can also:
If you read the data in with read.csv, use the stringsAsFactors argument to control whether strings are read in as factors.
Work with factors directly:
R> fruits$X2 = factor(fruits$X2, levels = unique(c(as.character(fruits$X2), "Other")))
R> fruits[!(fruits$X2 %in% c("lemon", "orange")),]$X2 = "Other"
R> fruits
X1 X2
1 1 Other
2 2 orange
3 3 lemon
Notice that the first line extends the levels of the factor to include "Other". Without that, assigning "Other" would produce NA with a warning.
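To illustrate why the extra level matters, here is a small sketch on a standalone factor. It adds the level with levels<-, which is a different route from the factor(..., levels = ...) call above but has the same effect; the variable names x, y and z are only illustrative.

```r
x <- factor(c("kiwi", "orange", "lemon"))

# Assigning a value that is not an existing level yields NA.
# suppressWarnings() only hides the "invalid factor level, NA generated" warning.
y <- x
suppressWarnings(y[!(y %in% c("lemon", "orange"))] <- "Other")
is.na(y)

# Adding the level first makes the same assignment work.
z <- x
levels(z) <- c(levels(z), "Other")
z[!(z %in% c("lemon", "orange"))] <- "Other"
as.character(z)
```

After the second version, z still carries the now unused "kiwi" level; droplevels(z) removes it if that matters.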
Upvotes: 1
Reputation: 11946
First create a variable indicating the rows to be altered. You can do this e.g. like this:
shouldBecomeOther<-!(fruits$X2 %in% c("orange", "lemon"))
Then use that indexer:
fruits$X2[shouldBecomeOther]<- "other"
Note that if the column is a factor (highly likely), it will take some more work, like this:
tmp<-as.character(fruits$X2)
tmp[shouldBecomeOther]<-"other"
fruits$X2<-factor(tmp)
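If this comes up often, the steps can be wrapped in a small function. The sketch below uses a hypothetical helper called lump_other (not part of base R) and assumes that a factor should come back as a factor and a character vector as a character vector.

```r
# Hypothetical helper: replace every value not in `keep` with `other`.
# Works for character vectors and factors; factors are rebuilt at the end.
lump_other <- function(x, keep, other = "other") {
  was_factor <- is.factor(x)
  x <- as.character(x)          # work on plain strings
  x[!x %in% keep] <- other      # everything outside `keep` becomes `other`
  if (was_factor) factor(x) else x
}

# The X2 column from the question, as a factor.
X2 <- factor(c("kiwi", "orange", "lemon", "apple",
               "lemon", "orange", "orange", "banana"))

result <- lump_other(X2, keep = c("orange", "lemon"))
table(result)
```

Note that missing values are also mapped to "other" here, because NA %in% keep is FALSE; add an is.na() check if NAs should be preserved.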
Upvotes: 5