Reputation: 33

How to get max of a column based on other columns in R

Suppose I have the following table,

Name	City	Value
Tom	NY	1
Tom	NY	2
Tom	NY	4
James	NY	1
James	NY	2
Tony	DC	1
Tony	DC	2

I want to find the total value city-wise, however for each Name only their maximum value must be taken. So for this table I should get,

City	Value
NY	6
DC	2

I have tried using various methods using dyplr but none seem to work

Upvotes: 3

Answers (3)

AndrewGB

Reputation: 16876

Here is a data.table approach in one line:

library(data.table)

setDT(df)[df[, .I[which.max(Value)], by=c("Name", "City")]$V1][, sum(Value), by=City]

Output

   City V1
1:   NY  6
2:   DC  2

base R

aggregate(Value ~ City, merge(aggregate(Value ~ Name + City, data=df, max), df, all.x=T), sum)

  City Value
1   DC     2
2   NY     6

Data

df <- structure(list(
  Name = c("Tom", "Tom", "Tom", "James", "James",
           "Tony", "Tony"),
  City = c("NY", "NY", "NY", "NY", "NY", "DC",
           "DC"),
  Value = c(1L, 2L, 4L, 1L, 2L, 1L, 2L)
),
class = "data.frame",
row.names = c(NA, -7L))

Upvotes: 1

Ronak Shah

Reputation: 389315

Here's another approach -

library(dplyr)

df %>%
  arrange(desc(Value)) %>%
  distinct(Name, City, .keep_all = TRUE) %>%
  group_by(City) %>%
  summarise(Value = sum(Value))

#  City  Value
#  <chr> <int>
#1 DC        2
#2 NY        6

Upvotes: 2

Bloxx

Reputation: 1560

Lets assume your dataset is called df1:

library(dplyr)
df1 %>% group_by(City, Name) %>% slice_max(Value, with_ties = FALSE) %>% ungroup() %>% select(City, Value) %>% group_by(City) %>% summarise(Value = sum(Value))

Upvotes: 0

How to get max of a column based on other columns in R

Answers (3)

Related Questions