
Reputation: 1109

R: get min value of a column conditional on a categorical variable

I have a dataset that looks like the following:

    Attribute   estimate    
    Proximity   3.7 
    Proximity   1.54    
    Proximity   0.45    
    Waittime    0.7 
    Waittime    0.76    
    service     0.6 
    Knowledge   0.7 

I want to get the min (and max) value for each attribute. I know I can get the minimum with the following code:

min_est <- fit.leb %>%
  # For each attribute
  group_by(Attribute) %>%
  filter(estimate == min(estimate))

However, some attributes have only one value (e.g. service and Knowledge). For these attributes, I want the returned value to be 0. That is, I want a result like the following:

    Attribute   estimate    
    Proximity   0.45    
    Waittime    0.7 
    service     0   
    Knowledge   0

I don't know how to adjust my code to handle this extra condition.

Upvotes: 0

Views: 4639

Answers (3)

Curt F.

Reputation: 4824

I like Kara Woo's solution but in case you don't want to define your own function:

fit.leb <- data.frame(Attribute = c('Proximity',
                                    'Proximity',
                                    'Proximity',    
                                    'Waittime', 
                                    'Waittime',     
                                    'service',   
                                    'Knowledge'), 
                      estimate = runif(7)
                      )


fit.leb %>%
  group_by(Attribute) %>%
  mutate(count_by_group = n()) %>%
  mutate(repeated_values = estimate * as.logical(count_by_group - 1)) %>%
  summarize(my_min = min(repeated_values))

Upvotes: 0

Gopala

Reputation: 10473

You can use something like this:

df %>% group_by(Attribute) %>% summarise(estimate = ifelse(n() > 1, min(estimate), 0))

Output will be as follows:

Source: local data frame [4 x 2]

  Attribute estimate
     (fctr)    (dbl)
1 Knowledge     0.00
2 Proximity     0.45
3   service     0.00
4  Waittime     0.70
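
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes `df` holds the example data from the question (the values are copied from there), and it uses a plain `if`/`else` rather than `ifelse()`, which should behave the same here because `n() > 1` is a single value per group. The name `result` is just for illustration.

```r
library(dplyr)

# Example data copied from the question
df <- data.frame(
  Attribute = c("Proximity", "Proximity", "Proximity",
                "Waittime", "Waittime", "service", "Knowledge"),
  estimate = c(3.7, 1.54, 0.45, 0.7, 0.76, 0.6, 0.7)
)

# Minimum per attribute, or 0 when the attribute has a single row
result <- df %>%
  group_by(Attribute) %>%
  summarise(estimate = if (n() > 1) min(estimate) else 0)

print(result)
```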

Upvotes: 1

Kara Woo

Reputation: 3615

Here's a custom function that will return 0 when the length of the data passed to it is 1, and will return the minimum otherwise.

my_min <- function(data) {
  if (length(data) == 1) {
    0
  } else {
    min(data, na.rm = TRUE) # assuming you want to remove NAs
  }
}

You can use it with dplyr::summarize() like so:

fit.leb %>%
  group_by(Attribute) %>%
  summarize(estimate = my_min(estimate))
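
As a quick sanity check, the function can also be tried without dplyr. The sketch below is not part of the original answer: it rebuilds `fit.leb` from the values in the question and applies `my_min()` per group with base R's `tapply()`. The name `result` is only illustrative.

```r
# Same helper as above: 0 for single-value groups, minimum otherwise
my_min <- function(data) {
  if (length(data) == 1) {
    0
  } else {
    min(data, na.rm = TRUE)
  }
}

# Example data copied from the question
fit.leb <- data.frame(
  Attribute = c("Proximity", "Proximity", "Proximity",
                "Waittime", "Waittime", "service", "Knowledge"),
  estimate = c(3.7, 1.54, 0.45, 0.7, 0.76, 0.6, 0.7)
)

# Apply my_min() to the estimates of each attribute (base R, no dplyr)
result <- tapply(fit.leb$estimate, fit.leb$Attribute, my_min)

print(result)
```

The result is a named vector with one entry per attribute, so individual values can be looked up by name, e.g. `result["Proximity"]`.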

Upvotes: 1
