Hardik Gupta

Reputation: 4790

Compare each row within each group

The following dataset is reproducible:

group <- c(1,1,2,2,3,3)
parameter <- c("A","B","A","B","A","B")
values <- c(10,20,20,5,30,50)
df <- data.frame(group,parameter,values)

group parameter values
    1         A     10
    1         B     20
    2         A     20
    2         B      5
    3         A     30
    3         B     50

Within each group, I want to check whether the value for A is greater than the value for B, and store the result in a fourth column for the entire group.

If yes -> TRUE; if no -> FALSE.

New Df:

group parameter values status
    1         A     10  FALSE
    1         B     20  FALSE
    2         A     20   TRUE
    2         B      5   TRUE
    3         A     30  FALSE
    3         B     50  FALSE

Approach

with(df, ave(values,group, FUN = function(x) ))

I cannot work out what code should go inside the function. Can someone please help?

Update: status should be the rank of the values column within each group (highest to lowest):

group parameter values status
    1         A     10      2
    1         B     20      1
    2         A     20      1
    2         B      5      2
    3         A     30      2
    3         B     50      1

Upvotes: 4

Views: 3100

Answers (2)

Jake Kaupp

Reputation: 8072

There is also the tidyverse solution using dplyr:

    library(dplyr)

    df %>% 
      group_by(group) %>% 
      mutate(status = values[parameter == "A"] > values[parameter == "B"],
             rank = min_rank(-values))

Source: local data frame [6 x 5]
Groups: group [3]

  group parameter values status  rank
  (dbl)    (fctr)  (dbl)  (lgl) (int)
1     1         A     10  FALSE     2
2     1         B     20  FALSE     1
3     2         A     20   TRUE     1
4     2         B      5   TRUE     2
5     3         A     30  FALSE     2
6     3         B     50  FALSE     1

Upvotes: 2

akrun

Reputation: 886948

We can try data.table. Convert the 'data.frame' to a 'data.table' with setDT(df), group by 'group', compare the 'values' where 'parameter' is 'A' with the 'values' where it is 'B', and assign (:=) the result to create 'status'.

library(data.table)
setDT(df)[, status := values[parameter=="A"]>values[parameter=="B"], by = group]
df
#   group parameter values status
#1:     1         A     10  FALSE
#2:     1         B     20  FALSE
#3:     2         A     20   TRUE
#4:     2         B      5   TRUE
#5:     3         A     30  FALSE
#6:     3         B     50  FALSE

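For the rank, use frank on the negated 'values' after grouping by 'group' (the output below starts from the original df, before the logical 'status' column was added).

setDT(df)[, status := frank(-values), by = group]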
df
#   group parameter values status
#1:     1         A     10      2
#2:     1         B     20      1
#3:     2         A     20      1
#4:     2         B      5      2
#5:     3         A     30      2
#6:     3         B     50      1
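
For comparison, the same ranking can be sketched in base R with ave, which is what the question started from. This is only an illustration: the column name rank and the choice of ties.method = "min" (picked to mirror dplyr's min_rank; frank defaults to average ranks for ties) are assumptions, not part of the original answer.

```r
# Rebuild the example data so the snippet runs on its own
df <- data.frame(
  group     = c(1, 1, 2, 2, 3, 3),
  parameter = c("A", "B", "A", "B", "A", "B"),
  values    = c(10, 20, 20, 5, 30, 50)
)

# Rank the negated values within each group, so the largest value gets rank 1
df$rank <- with(df, ave(-values, group,
                        FUN = function(x) rank(x, ties.method = "min")))

df$rank
# 2 1 1 2 2 1
```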

Or with ave, we can compare the first value with the second one (assuming that 'parameter' is ordered and there are only two elements per 'group').

df$status <- with(df, as.logical(ave(values, group, FUN = function(x) x[1] > x[2])))
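
If the rows may not be ordered by 'parameter', one possible variation is to pass row indices to ave and look the values up by label inside the function. A minimal sketch, assuming each group has exactly one A row and one B row; the helper name idx and the shuffled data are made up for illustration:

```r
# Example data with the rows deliberately shuffled within groups
df <- data.frame(
  group     = c(1, 1, 2, 2, 3, 3),
  parameter = c("B", "A", "A", "B", "B", "A"),
  values    = c(20, 10, 20, 5, 50, 30)
)

# Apply the function to the row indices of each group, so that both
# 'values' and 'parameter' can be looked up for that group
idx <- seq_len(nrow(df))
df$status <- as.logical(ave(idx, df$group, FUN = function(i) {
  v <- df$values[i]
  p <- df$parameter[i]
  v[p == "A"] > v[p == "B"]  # one logical per group, recycled by ave
}))

df$status
# FALSE FALSE TRUE TRUE FALSE FALSE
```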

Another option is to order the dataset by the first two columns (in case it is not already ordered), then subset 'values' with recycled logical indices, compare, and replicate each of the logical values twice.

df1 <- df[do.call(order, df[1:2]), ]
rep(df1$values[c(TRUE, FALSE)] > df1$values[c(FALSE, TRUE)], each = 2)
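
The rep call above only returns the logical vector. A minimal sketch of storing it as a column of the sorted copy, again assuming exactly one A row and one B row per group (the intermediate name a_gt_b is illustrative):

```r
# Rebuild the example data so the snippet runs on its own
df <- data.frame(
  group     = c(1, 1, 2, 2, 3, 3),
  parameter = c("A", "B", "A", "B", "A", "B"),
  values    = c(10, 20, 20, 5, 30, 50)
)

# Sort by group, then parameter, so rows alternate A, B, A, B, ...
df1 <- df[do.call(order, df[1:2]), ]

# Odd rows are A, even rows are B: compare them pairwise
a_gt_b <- df1$values[c(TRUE, FALSE)] > df1$values[c(FALSE, TRUE)]

# Repeat each group's result for both of its rows
df1$status <- rep(a_gt_b, each = 2)

df1$status
# FALSE FALSE TRUE TRUE FALSE FALSE
```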

Upvotes: 4
