Amnor

Reputation: 380

R reassign values from a column depending on the frequency

I'm trying to take the "Names" column of my data frame and change the names that occur least often to "Others", in order to simplify a later Java program. For example:

someValue   Names
1           Ramon
2           Alex
4           Ramon
1           Luke
2           Han
3           Leia
4           Luke
8           Ramon
20          Luke

Now, the names that occur fewer than 3 times have to become "Others":

someValue   Names
1           Ramon
2           Others
4           Ramon
1           Luke
2           Others
3           Others
4           Luke
8           Ramon
20          Luke

I am a little lost with this. I hope someone knows a quick way to do it. Thanks in advance!

Upvotes: 0

Views: 45

Answers (2)

rnso

Reputation: 24613

The following one-liner also works:

> ddf$Names = ifelse(ddf$someValue<3, 'Others', ddf$Names)

or:

> ddf$Names = with(ddf, ifelse(someValue<3, 'Others', Names))

> ddf
  someValue  Names
1         1 Others
2         2 Others
3         4  Ramon
4         1 Others
5         2 Others
6         3   Leia
7         4   Luke
8         8  Ramon
9        20   Luke

Just make sure that the Names column is 'character' and not 'factor'. If it is a factor, convert it with ddf$Names <- as.character(ddf$Names).
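Note that the condition above tests someValue rather than how often each name occurs. A minimal sketch of a frequency-based variant, assuming the question's sample data and a threshold of 3 occurrences (the `freq` variable is introduced here only for illustration):

```r
# Sample data from the question; Names kept as character
ddf <- data.frame(
    someValue = c(1, 2, 4, 1, 2, 3, 4, 8, 20),
    Names = c("Ramon", "Alex", "Ramon", "Luke", "Han",
              "Leia", "Luke", "Ramon", "Luke"),
    stringsAsFactors = FALSE
)

# ave() applies length() within each Names group,
# so every row gets the count of its own name
freq <- ave(seq_along(ddf$Names), ddf$Names, FUN = length)

# Names occurring fewer than 3 times become "Others"
ddf$Names <- ifelse(freq < 3, "Others", ddf$Names)
ddf
```

With this data, Alex, Han and Leia each occur once and are replaced, while Ramon and Luke (three occurrences each) are kept.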

Upvotes: 1

Jasper

Reputation: 555

You can use the table function to calculate the frequencies, and then find the ones whose frequencies are too low.
An example using character strings:

set.seed(123)
df <- data.frame(
    someValue = 1:50,
    Names = sample(LETTERS, 50, TRUE),
    stringsAsFactors = FALSE
)
n.tab <- table( df$Names )
n.many <- names( n.tab[ n.tab > 3] )
df[ !(df$Names %in% n.many), "Names"] <- "Others"
df

Or the same example, but with a factor:

set.seed(123)
df <- data.frame(
    someValue = 1:50,
    Names = sample(LETTERS, 50, TRUE),
    stringsAsFactors = TRUE  # R >= 4.0 defaults to FALSE, so request a factor explicitly
)
n.tab <- table( df$Names )
n.many <- names( n.tab[ n.tab > 3] )

levels(df$Names)[ !(levels(df$Names) %in% n.many) ] <- "Others"
df
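The two variants above can be folded into one small helper. This is only a sketch: `lump_rare`, `min_count` and `other` are hypothetical names, and the threshold of 3 is taken from the question rather than from the random example above.

```r
# Hypothetical helper: replace values seen fewer than min_count times
lump_rare <- function(x, min_count = 3, other = "Others") {
    counts <- table(x)
    keep <- names(counts[counts >= min_count])
    if (is.factor(x)) {
        # Giving several levels the same label merges them into one level
        levels(x)[!(levels(x) %in% keep)] <- other
    } else {
        # Leave missing values untouched; replace only the rare names
        x[!is.na(x) & !(x %in% keep)] <- other
    }
    x
}

# The Names column from the question
names_chr <- c("Ramon", "Alex", "Ramon", "Luke", "Han",
               "Leia", "Luke", "Ramon", "Luke")

lump_rare(names_chr)          # character input
lump_rare(factor(names_chr))  # factor input
```

Applied to a data frame, this would be used as `df$Names <- lump_rare(df$Names)`.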

Upvotes: 2
