Reputation: 4790
The following dataset is reproducible:
group <- c(1,1,2,2,3,3)
parameter <- c("A","B","A","B","A","B")
values <- c(10,20,20,5,30,50)
df <- data.frame(group,parameter,values)
group parameter values
1 A 10
1 B 20
2 A 20
2 B 5
3 A 30
3 B 50
Within each group, I want to check whether the value for A is greater than the value for B, and store the result in a fourth column for the entire group.
If yes, TRUE; if no, FALSE.
Desired data frame:
group parameter values status
1 A 10 FALSE
1 B 20 FALSE
2 A 20 TRUE
2 B 5 TRUE
3 A 30 FALSE
3 B 50 FALSE
Approach
with(df, ave(values,group, FUN = function(x) ))
I can't work out what code should go inside the function. Can someone please help?
Update: status should instead be the rank of the values column within each group (highest to lowest):
group parameter values status
1 A 10 2
1 B 20 1
2 A 20 1
2 B 5 2
3 A 30 2
3 B 50 1
Upvotes: 4
Views: 3100
Reputation: 8072
There is also the tidyverse solution using dplyr:
library(dplyr)
df %>%
  group_by(group) %>%
  # the comparison already yields TRUE/FALSE, so no ifelse() wrapper is needed
  mutate(status = values[parameter == "A"] > values[parameter == "B"],
         rank = min_rank(-values))
Source: local data frame [6 x 5]
Groups: group [3]
group parameter values status rank
(dbl) (fctr) (dbl) (lgl) (int)
1 1 A 10 FALSE 2
2 1 B 20 FALSE 1
3 2 A 20 TRUE 1
4 2 B 5 TRUE 2
5 3 A 30 FALSE 2
6 3 B 50 FALSE 1
Upvotes: 2
Reputation: 886948
We can try with data.table. Convert the 'data.frame' to 'data.table' (setDT(df)), then, grouped by 'group', compare the 'values' where 'parameter' is 'A' with those where it is 'B', and assign (:=) the result to create 'status'.
library(data.table)
setDT(df)[, status := values[parameter=="A"]>values[parameter=="B"], by = group]
df
# group parameter values status
#1: 1 A 10 FALSE
#2: 1 B 20 FALSE
#3: 2 A 20 TRUE
#4: 2 B 5 TRUE
#5: 3 A 30 FALSE
#6: 3 B 50 FALSE
For the rank, use frank on the negated 'values' after grouping by 'group'.
setDT(df)[, status:= frank(-values), group]
df
# group parameter values status
#1: 1 A 10 2
#2: 1 B 20 1
#3: 2 A 20 1
#4: 2 B 5 2
#5: 3 A 30 2
#6: 3 B 50 1
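Since the question started from ave, the same ranking can also be sketched in base R. This is only an illustration on the question's example data, not part of the data.table solution; negating 'values' makes rank 1 correspond to the largest value in each group.

```r
# Rebuild the example data from the question
df <- data.frame(
  group     = c(1, 1, 2, 2, 3, 3),
  parameter = c("A", "B", "A", "B", "A", "B"),
  values    = c(10, 20, 20, 5, 30, 50)
)

# Rank within each group, highest value first
# (rank() sorts ascending, so the values are negated)
df$status <- with(df, ave(-values, group, FUN = rank))
df
```

Note that rank() uses average ranks for ties by default, so tied values within a group would get fractional ranks.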
Or with ave, we can compare the first value with the second one (assuming that 'parameter' is ordered and that there are only two elements per 'group'):
df$status <- with(df, as.logical(ave(values, group, FUN = function(x) x[1] > x[2])))
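The ave call above relies on 'A' coming before 'B' in every group. If that cannot be guaranteed, one possible variation is to pass row indices to ave and look up 'A' and 'B' by name inside the function. The sketch below uses a reshuffled copy of the example data (made up here for illustration) and still assumes exactly one 'A' and one 'B' per group.

```r
# Example data with 'A' and 'B' in mixed order within groups (illustrative)
df <- data.frame(
  group     = c(1, 1, 2, 2, 3, 3),
  parameter = c("B", "A", "A", "B", "B", "A"),
  values    = c(20, 10, 20, 5, 50, 30)
)

# ave() hands each group's row indices to FUN; the single TRUE/FALSE result
# is recycled across the group (stored as 1/0, hence as.logical)
df$status <- with(df, as.logical(ave(
  seq_along(values), group,
  FUN = function(i) values[i][parameter[i] == "A"] > values[i][parameter[i] == "B"]
)))
df
```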
Another option is to order the dataset by the first two columns (in case it is not already ordered), then subset 'values' with recycled logical indices to pick out the 'A' and 'B' rows, compare them, and replicate each of the resulting logical values twice.
df1 <- df[do.call(order, df[1:2]), ]
rep(df1$values[c(TRUE, FALSE)] > df1$values[c(FALSE, TRUE)], each = 2)
Upvotes: 4