Gops71
Gops71

Reputation: 33

How to get max of a column based on other columns in R

Suppose I have the following table,

Name City Value
Tom NY 1
Tom NY 2
Tom NY 4
James NY 1
James NY 2
Tony DC 1
Tony DC 2

I want to find the total value city-wise, however for each Name only their maximum value must be taken. So for this table I should get,

City Value
NY 6
DC 2

I have tried using various methods using dyplr but none seem to work

Upvotes: 3

Views: 3226

Answers (3)

AndrewGB
AndrewGB

Reputation: 16876

Here is a data.table approach in one line:

library(data.table)

setDT(df)[df[, .I[which.max(Value)], by=c("Name", "City")]$V1][, sum(Value), by=City]

Output

   City V1
1:   NY  6
2:   DC  2

base R

aggregate(Value ~ City, merge(aggregate(Value ~ Name + City, data=df, max), df, all.x=T), sum)

  City Value
1   DC     2
2   NY     6

Data

df <- structure(list(
  Name = c("Tom", "Tom", "Tom", "James", "James",
           "Tony", "Tony"),
  City = c("NY", "NY", "NY", "NY", "NY", "DC",
           "DC"),
  Value = c(1L, 2L, 4L, 1L, 2L, 1L, 2L)
),
class = "data.frame",
row.names = c(NA, -7L))

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 389315

Here's another approach -

library(dplyr)

df %>%
  arrange(desc(Value)) %>%
  distinct(Name, City, .keep_all = TRUE) %>%
  group_by(City) %>%
  summarise(Value = sum(Value))

#  City  Value
#  <chr> <int>
#1 DC        2
#2 NY        6

Upvotes: 2

Bloxx
Bloxx

Reputation: 1560

Lets assume your dataset is called df1:

library(dplyr)
df1 %>% group_by(City, Name) %>% slice_max(Value, with_ties = FALSE) %>% ungroup() %>% select(City, Value) %>% group_by(City) %>% summarise(Value = sum(Value))

Upvotes: 0

Related Questions