Drew

Reputation: 583

Using intervals in a column to populate values for another column

I have a dataframe:

dataframe <- data.frame(Condition = rep(c(1,2,3), each = 5, times = 2),
                        Time = sort(sample(1:60, 30)))
     Condition Time
1          1    1
2          1    3
3          1    4
4          1    7
5          1    9
6          2   11
7          2   12
8          2   14
9          2   16
10         2   18
11         3   19
12         3   24
13         3   25
14         3   28
15         3   30
16         1   31
17         1   34
18         1   35
19         1   38
20         1   39
21         2   40
22         2   42
23         2   44
24         2   47
25         2   48
26         3   49
27         3   54
28         3   55
29         3   57
30         3   59

I want to divide the total length of Time (i.e., max(Time) - min(Time)) per Condition by a constant 'x' (e.g., 3). Then I want to use that quotient to add a new variable Trial such that my dataframe looks like this:

     Condition Time Trial
1          1    1     A
2          1    3     A
3          1    4     B
4          1    7     C
5          1    9     C
6          2   11     A
7          2   12     A
8          2   14     B
9          2   16     C
10         2   18     C
... and so on

As you can see, for Condition 1, Trial is populated with a unique identifying value (e.g., A, B, C) every 2.67 seconds = 8 (total time) / 3. For Condition 2, Trial is populated every 2.33 seconds = 7 (total time) / 3.
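To make the arithmetic concrete, here is a minimal base R sketch for the first run of Condition 1 only. The variable names (`time`, `x`, `edges`) and the use of `findInterval()` are illustrative assumptions, not part of the original code.

```r
# Times for the first run of Condition 1, copied from the example above
time <- c(1, 3, 4, 7, 9)
x <- 3                                   # the constant 'x' (number of intervals)

total <- max(time) - min(time)           # total length of Time: 8
width <- total / x                       # interval width: about 2.67

# x + 1 equally spaced edges from min(time) to max(time)
edges <- seq(min(time), max(time), length.out = x + 1)

# Index of the interval each time falls into; rightmost.closed = TRUE
# keeps the maximum (9) inside the last interval
idx <- findInterval(time, edges, rightmost.closed = TRUE)
trial <- LETTERS[idx]
print(trial)
```

For these five times the sketch yields A, A, B, C, C, which matches the desired Trial column for rows 1 to 5.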

I am not getting what I want with my current code:

library(dplyr)

dataframe %>%
  group_by(Condition) %>%
  mutate(Trial = LETTERS[cut(Time, 3, labels = FALSE)])

# Groups:   Condition [3]
   Condition  Time Trial
       <dbl> <int> <chr>
 1         1     1 A    
 2         1     3 A    
 3         1     4 A    
 4         1     7 A    
 5         1     9 A    
 6         2    11 A    
 7         2    12 A    
 8         2    14 A    
 9         2    16 A    
10         2    18 A    
# ... with 20 more rows

Thanks!

Upvotes: 1

Views: 175

Answers (2)

dash2

Reputation: 2262

Here's a one-liner using my santoku package. The rleid line is the same as mentioned in @akrun's solution.

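    library(dplyr)     # group_by(), mutate()
    library(magrittr)  # %<>% assignment pipe
    library(santoku)   # chop_evenly(), lbl_seq()

    dataframe %<>% 
         group_by(grp = data.table::rleid(Condition)) %>% 
         mutate(
           Trial = chop_evenly(Time, intervals = 3, labels = lbl_seq("A"))
         )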

Upvotes: 0

akrun

Reputation: 886948

We can take the difference of the range (range returns the min and max as a vector), divide it by the constant (here 3), and pass the result as the breaks argument of cut. Then use the integer index (labels = FALSE) to pick the corresponding letter from LETTERS, R's built-in constant.

library(dplyr)
dataframe %>% 
    group_by(Condition) %>%
    mutate(Trial = LETTERS[cut(Time, diff(range(Time))/3,
        labels = FALSE)])

If the grouping should be based on runs of adjacent values in 'Condition', use rleid from data.table on the 'Condition' column to create the grouping, and apply the same code as above.

library(data.table)
dataframe %>%
    group_by(grp = rleid(Condition)) %>%
     mutate(Trial = LETTERS[cut(Time, diff(range(Time))/3,
        labels = FALSE)])
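As a package-free illustration of grouping by runs, the sketch below uses only base R on the first 20 rows of the question's printed data. It makes two assumptions of its own: `cumsum()` over changes in Condition stands in for `rleid`, and the number of intervals (3) is passed directly to `cut` rather than a width, so it is not necessarily equivalent to the code above.

```r
# First 20 rows of the example data, copied from the question's output
df <- data.frame(
  Condition = c(rep(1, 5), rep(2, 5), rep(3, 5), rep(1, 5)),
  Time = c(1, 3, 4, 7, 9, 11, 12, 14, 16, 18,
           19, 24, 25, 28, 30, 31, 34, 35, 38, 39)
)

# Base R stand-in for data.table::rleid(): the id increases by one
# every time Condition changes from one row to the next
df$grp <- cumsum(c(TRUE, diff(df$Condition) != 0))

# Within each run, cut Time into 3 equal-width intervals and
# map the interval index 1, 2, 3 to A, B, C
df$Trial <- LETTERS[ave(df$Time, df$grp,
                        FUN = function(t) cut(t, 3, labels = FALSE))]
print(df)
```

Because the second run of Condition 1 (rows 16 to 20) gets its own `grp` value, its Time range is 31 to 39 rather than 1 to 39, so each run is split into its own three intervals.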

Upvotes: 1
