Remi
Remi

Reputation: 1111

Create a column based on other columns value

I'm looking for a way to create a new column with values based on the value in the first column, with the highest score in the same cluster.

The goal is to move from this:

WORD     | SCORE | Cluster |
diptyqu  | 19750 |    1    |
daptyqu  | 8830  |    1    |
dyptiqu  | 29    |    1    |
deptiqu  | 2     |    1    |
parfume  | 300   |    2    |
parfum   | 14    |    2    |

To this:

WORD     | SCORE | Cluster | Name    |
diptyqu  | 19750 |    1    | diptyqu |
daptyqu  | 8830  |    1    | diptyqu |
dyptiqu  | 29    |    1    | diptyqu |
deptiqu  | 2     |    1    | diptyqu |
parfume  | 300   |    2    | parfume |
parfum   | 14    |    2    | parfume |

I'm litteraly stuck and don't really know how to make it. Do you have any suggestion on a method to do this?

Upvotes: 0

Views: 258

Answers (4)

You can use 'data.table' packages and merge function:

    library(data.table)
    df2 <- setDT(df)[, .SD[which.max(SCORE)], Cluster]
    result <- merge(df, df2[, c('Cluster', 'WORD')], by='Cluster')

Upvotes: 0

d.b
d.b

Reputation: 32538

do.call(rbind,
        lapply(split(dat, dat$Cluster), function(a)
            data.frame(a, Name = a$WORD[which.max(a$SCORE)])))
#       WORD SCORE Cluster    Name
#1.1 diptyqu 19750       1 diptyqu
#1.2 daptyqu  8830       1 diptyqu
#1.3 dyptiqu    29       1 diptyqu
#1.4 deptiqu     2       1 diptyqu
#2.5 parfume   300       2 parfume
#2.6  parfum    14       2 parfume

Upvotes: 1

pogibas
pogibas

Reputation: 28309

Similar strategy as in @Gregor's dplyr solution, but using data.table:

library(data.table)
setDT(d)
d[order(SCORE), Name := tail(WORD, 1), Cluster]

Upvotes: 2

Gregor Thomas
Gregor Thomas

Reputation: 145745

Using dplyr

your_data %>%
  group_by(cluster) %>%
  mutate(Name = first(WORD))

It assumes your rows are already sorted by SCORE. If needed, you can do that sorting too:

your_data %>%
  group_by(cluster) %>%
  arrange(desc(SCORE)) %>%
  mutate(Name = first(WORD))

Upvotes: 2

Related Questions