Rasmus Larsen
Rasmus Larsen

Reputation: 6117

Conditional ntiles in tidyverse

I want to calculate ntiles using only a subset of the rows, using tidyverse. The following base R code does what I want:

Base R:

diamonds$conditional_quartiles_var                           <- NA
diamonds$conditional_quartiles_var[ diamonds$price >= 1000 ] <- ntile( diamonds$price[ diamonds$price >= 1000 ], n = 4 )
diamonds$conditional_quartiles_var[ diamonds$price <  1000 ] <- "Less than 1000"

diamonds %>% count(conditional_quartiles_var)

Outputs (what I want):

# A tibble: 5 x 2
  conditional_quartiles_var     n
                      <chr> <int>
1                         1  9861
2                         2  9860
3                         3  9860
4                         4  9860
5            Less than 1000 14499

The above results are what I want, because the ntiles are calculated only from values of price >= 1000.

Tidyverse attempt

My tidyverse implementation fails because the ntiles are calculated from the whole price-vector:

library(tidyverse)


diamonds %>% 
    mutate( wrong_conditional_quartiles_var = case_when(  price >= 1000 ~   ntile(price, n = 4) %>% as.character(),
                                                          price <  1000 ~   "Less than 1000")) %>%
    count( wrong_conditional_quartiles_var)

Outputs (not what I want):

# A tibble: 4 x 2
  wrong_conditional_quartiles_var     n
                            <chr> <int>
1                               2 12471
2                               3 13485
3                               4 13485
4                  Less than 1000 14499

Upvotes: 2

Views: 554

Answers (1)

akrun
akrun

Reputation: 887223

We could use replace

library(dplyr)
diamonds %>%
   mutate(quart = "Less than 1000", 
          quart = replace(quart, price >= 1000, ntile(price[price>=1000], 4))) %>%
   count(quart)
# A tibble: 5 x 2
#           quart     n
#           <chr> <int>
#1              1  9861
#2              2  9860
#3              3  9860
#4              4  9860
#5 Less than 1000 14499

Upvotes: 2

Related Questions