Display name

Reputation: 4501

Is there an R function to partition data based on a user-supplied vector?

library(tidyverse)
elec.store <- tibble(computer = c(rep("Dell", 3), rep("HP", 3), rep("Lenovo", 3)),
                sold = c(6, 2, 3, 8, 7, 5, 1, 1, 9))
#> # A tibble: 9 x 2
#>   computer  sold
#>   <chr>    <dbl>
#> 1 Dell         6
#> 2 Dell         2
#> 3 Dell         3
#> 4 HP           8
#> 5 HP           7
#> 6 HP           5
#> 7 Lenovo       1
#> 8 Lenovo       1
#> 9 Lenovo       9

Say I have the electronics store data frame shown above. I'd like a function, called something like parting_function(elec.store, c(2, 6)), that adds a new column assigning each row to an arbitrary group (shown below; I chose letters of the alphabet, but the labels can be anything). In case it is not obvious, the data is parted after the 2nd row and after the 6th row.

Does such a "parting" function exist? If not, how would I write one? This is what I want it to do, without having to manually choose the letters and how many times to repeat each (e.g. 2, 4, 3, as shown below):

elec.store %>% mutate(grouping = c(rep("A", 2), rep("B", 4), rep("C", 3)))
#> # A tibble: 9 x 3
#>   computer  sold grouping
#>   <chr>    <dbl> <chr>   
#> 1 Dell         6 A       
#> 2 Dell         2 A       
#> 3 Dell         3 B       
#> 4 HP           8 B       
#> 5 HP           7 B       
#> 6 HP           5 B       
#> 7 Lenovo       1 C       
#> 8 Lenovo       1 C       
#> 9 Lenovo       9 C     

Upvotes: 0

Views: 73

Answers (2)

akrun

Reputation: 887971

One option is to build a grouping index: compare row_number() against part_vector + 1 to get a logical vector that is TRUE on the first row of each new group, take its cumulative sum, add 1, and use the result to index LETTERS (a built-in vector):

part_vector <- c(2, 6)
elec.store %>% 
    mutate(grouping =  LETTERS[1 + cumsum(row_number() %in% (part_vector + 1))])
# A tibble: 9 x 3
#  computer  sold grouping
#  <chr>    <dbl> <chr>   
#1 Dell         6 A       
#2 Dell         2 A       
#3 Dell         3 B       
#4 HP           8 B       
#5 HP           7 B       
#6 HP           5 B       
#7 Lenovo       1 C       
#8 Lenovo       1 C       
#9 Lenovo       9 C    

Here, LETTERS is used just for the example. If there are more than 26 groups, a longer vector of labels is easy to create:

grp <- c(LETTERS, do.call(paste0, expand.grid(rep(list(LETTERS), 2))))
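As a rough sketch of how grp could stand in for LETTERS once there are more than 26 groups: the many_rows tibble and the many_cuts vector below are hypothetical, made up only to produce 30 groups, and are not part of the original question.

```r
library(dplyr)

# Label pool: 26 single letters followed by 676 two-letter codes.
# expand.grid() varies its first column fastest, so the two-letter
# codes run "AA", "BA", "CA", ... rather than "AA", "AB", "AC", ...
grp <- c(LETTERS, do.call(paste0, expand.grid(rep(list(LETTERS), 2))))

# Hypothetical data: 60 rows, parted after every second row (30 groups)
many_rows <- tibble(id = 1:60)
many_cuts <- seq(2, 58, by = 2)

# Same cumsum index as above, but looked up in grp instead of LETTERS
labelled <- many_rows %>%
  mutate(grouping = grp[1 + cumsum(row_number() %in% (many_cuts + 1))])
```

With this pool the 27th group is labelled "AA" and the 28th "BA". If "AB" should follow "AA" instead, the two pasted columns would need to be swapped.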

Upvotes: 1

Ronak Shah

Reputation: 389325

We can use cut to break the row numbers at the positions given in part_vector. The intervals are closed on the right by default, so rows 2 and 6 each end a group.

part_vector <- c(2, 6)
elec.store$grouping <- cut(seq_len(nrow(elec.store)),
                        breaks = c(-Inf, part_vector, Inf), 
                        labels = LETTERS[seq_len(length(part_vector) + 1)])



# A tibble: 9 x 3
#  computer  sold grouping
#  <chr>    <dbl> <fct>   
#1 Dell         6 A       
#2 Dell         2 A       
#3 Dell         3 B       
#4 HP           8 B       
#5 HP           7 B       
#6 HP           5 B       
#7 Lenovo       1 C       
#8 Lenovo       1 C       
#9 Lenovo       9 C    

To fit this into a dplyr pipeline:

library(dplyr)
elec.store %>%
  mutate(grouping = cut(seq_len(n()), 
                     breaks = c(-Inf, part_vector, Inf), 
                     labels = LETTERS[seq_len(length(part_vector) + 1)]))

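You could also get the same grouping (as a character column rather than a factor) using findInterval; left.open = TRUE makes each break row the last row of its group: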

elec.store$grouping <- LETTERS[findInterval(seq_len(nrow(elec.store)),
                       part_vector, left.open = TRUE) + 1]
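Since the question asks for something callable as parting_function(elec.store, c(2, 6)), the findInterval approach could be wrapped along these lines. This is only a sketch: the function name is taken from the question, while the argument names breaks and labels and the sanity check are assumptions of this example.

```r
library(dplyr)

# Hypothetical helper named after the question's parting_function().
# `breaks` holds the row numbers after which a new group starts;
# `labels` is the pool of group names (assumed to default to LETTERS).
parting_function <- function(data, breaks, labels = LETTERS) {
  breaks <- sort(unique(breaks))
  # One more group than there are breaks, so the pool must be that long
  stopifnot(length(breaks) + 1 <= length(labels))
  # left.open = TRUE makes each break row the last row of its group
  idx <- findInterval(seq_len(nrow(data)), breaks, left.open = TRUE) + 1
  data$grouping <- labels[idx]
  data
}

elec.store <- tibble(
  computer = rep(c("Dell", "HP", "Lenovo"), each = 3),
  sold = c(6, 2, 3, 8, 7, 5, 1, 1, 9)
)

result <- parting_function(elec.store, c(2, 6))
```

Because the function returns the modified data frame, it should also work inside a pipe, e.g. elec.store %>% parting_function(c(2, 6)).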

Upvotes: 3
