Sylvie

Reputation: 1

How to summarise data based on median, creating fast and slow columns in R

I have a dataframe with multiple IDs, each with up to three conditions and corresponding data points (ReacTime).

|ID|Condition|ReacTime|
|1 | Cong    |537     |
|1 | Incong  |541     |
|1 | Cong    |500     |
|1 | Cong    |520     |
|1 | Incong  |537     |
|1 | Cong    |599     |
|2 | Cong    |650     |
|2 | Incong  |708     |
|2 | Cont    |672     |
|2 | Cong    |676     |
|2 | Incong  |822     |
|2 | Cont    |609     |
|3 | Cong    |630     |
|3 | Incong  |725     |
|3 | Cont    |680     |
|3 | Cong    |625     |
|3 | Incong  |700     |
|3 | Cont    |620     |

I found the median of ReacTime for each ID, and now I need a slow and a fast value for each ID: for each condition, the average of all values below the median (slow) and the average of all values above the median (fast).

I used the summarize function for the median value:

Df2 <- summarise(group_by(Df1, ID), medianvalue = median(ReacTime))

For the fast and slow I tried quantiles:

Df2 <- summarise(group_by(Df2, ID, Condition),
                 Slow = quantile(ReacTime, probs = 0.5),
                 Fast = quantile(ReacTime, probs = ?))

I am not sure what to put for probs in the Fast column.

Upvotes: 0

Views: 165

Answers (2)

akrun

Reputation: 887088

Using data.table

library(data.table)
setDT(df)[, {
             v1 <- median(ReacTime)
             .(medianvalue = v1, Slow = mean(ReacTime[ReacTime < v1]),
               Fast = mean(ReacTime[ReacTime > v1]))
           }, .(ID)]

-output

   ID medianvalue     Slow     Fast
1:  1         537 510.0000 570.0000
2:  2         674 643.6667 735.3333
3:  3         655 625.0000 701.6667
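
One detail worth checking is how ties are treated: the strict `<` and `>` comparisons leave out any observation equal to the median. A minimal base R sketch for ID 1, with the values copied from the question's table (the names `x`, `med`, `slow`, `fast` and `n_ties` are only illustrative), gives a median of 537, Slow of 510 and Fast of 570, with two tied values counted in neither half:

```r
# ReacTime values for ID 1, copied from the question's table
x <- c(537, 541, 500, 520, 537, 599)

med  <- median(x)          # middle of the sorted values
slow <- mean(x[x < med])   # mean of the values strictly below the median
fast <- mean(x[x > med])   # mean of the values strictly above the median

# Observations equal to the median fall in neither half
n_ties <- sum(x == med)

c(median = med, Slow = slow, Fast = fast, ties = n_ties)
```

If tied values should count toward one of the halves, `<=` or `>=` could be used instead; which side they belong to depends on the analysis.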

data

df <- structure(list(ID = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), Condition = c("Cong", "Incong", 
"Cong", "Cong", "Incong", "Cong", "Cong", "Incong", "Cont", "Cong", 
"Incong", "Cont", "Cong", "Incong", "Cont", "Cong", "Incong", 
"Cont"), ReacTime = c(537L, 541L, 500L, 520L, 537L, 599L, 650L, 
708L, 672L, 676L, 822L, 609L, 630L, 725L, 680L, 625L, 700L, 620L
)), class = "data.frame", row.names = c(NA, -18L))

Upvotes: 0

Ronak Shah

Reputation: 388972

You can calculate this in the same summarise code -

library(dplyr)

df %>%
  group_by(ID) %>%
  summarise(medianvalue = median(ReacTime), 
            Slow = mean(ReacTime[ReacTime < medianvalue]), 
            Fast = mean(ReacTime[ReacTime > medianvalue]))

#     ID medianvalue  Slow  Fast
#  <int>       <dbl> <dbl> <dbl>
#1     1         537  510   570 
#2     2         674  644.  735.
#3     3         655  625   702.
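
The question also mentions averaging "for each condition". If that means one Slow and Fast value per ID and Condition, still split at the per-ID median, one possible sketch is to attach the median with `mutate()` first and then regroup. This is an assumption about what is wanted, not something confirmed in the question; the data frame is rebuilt here from the question's table so the block runs on its own, and `by_cond` is just an illustrative name.

```r
library(dplyr)

# Same values as the table in the question
df <- data.frame(
  ID = rep(1:3, each = 6),
  Condition = c("Cong", "Incong", "Cong", "Cong", "Incong", "Cong",
                "Cong", "Incong", "Cont", "Cong", "Incong", "Cont",
                "Cong", "Incong", "Cont", "Cong", "Incong", "Cont"),
  ReacTime = c(537, 541, 500, 520, 537, 599,
               650, 708, 672, 676, 822, 609,
               630, 725, 680, 625, 700, 620)
)

by_cond <- df %>%
  group_by(ID) %>%
  mutate(medianvalue = median(ReacTime)) %>%   # median over all rows of an ID
  group_by(ID, Condition, medianvalue) %>%     # regroup, keeping the median column
  summarise(Slow = mean(ReacTime[ReacTime < medianvalue]),
            Fast = mean(ReacTime[ReacTime > medianvalue]),
            .groups = "drop")

by_cond
```

With so few rows per condition, some groups have no values on one side of the median, and `mean()` of an empty vector returns `NaN` for those cells.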

Upvotes: 2
