Reputation: 1
Sample data:
Group <- c("a", "a", "a", "b", "b", "b", "c", "c", "c")
value_1 <- c(1.10, 2.5, 1.7, 0.99, 1.50, 1.65, 2.5, 2.5, 1.5)
value_2 <- c(0.03, 1.3, 3.5, 0.02, 4.3, 1.2, 1.4, 1.4, 3.7)
new_variable_1 <- c(1,0,1, 1,1,0, 0,0,1)
df <- data.frame(Group, value_1, value_2, new_variable_1)
The expected output is new_variable_1. I want to create new_variable_1 based on the following criteria; I am seeking 2 solutions.
Basic idea: look up the max value in value_2 by group and create a dummy variable based on the values in value_1.
Solution 1:
1. Find max(value_2) by group. E.g., the max value in value_2 for group a is 3.5.
2. Find the corresponding value_1 by group. E.g., value_1 is 1.7 in group a.
3. Create new_variable_1 by group that is 1 if value_1 is less than or equal to the corresponding value in the above step, and 0 otherwise. E.g., for group a, value_1 <= 1.7 should show 1 and value_1 > 1.7 should show 0.
Solution 2: same as above, but increase the threshold value from step 2 by 10%.
1. The max value in value_2 for group a is 3.5.
2. It corresponds to value_1 = 1.7 in group a.
3. Increase that value by 10%. For group a, a 10% increase gives 1.87.
4. Create new_variable_1: for group a, value_1 <= 1.87 should show 1 and value_1 > 1.87 should show 0.
Base R, dplyr, data.table and the most efficient R code are all welcome.
It's a large dataset, so groups may have different lengths, and Inf or NA may exist in value_2.
Upvotes: 0
Views: 929
Reputation: 13309
We could try the following. I've used names starting with "New" to make it easier to follow.
Solution 1 (thanks to @Gregor):
library(dplyr)
df %>%
  group_by(Group) %>%
  # threshold = value_1 at the row where value_2 is largest within the group
  mutate(New_variable_1 = ifelse(value_1 <= value_1[which.max(value_2)], 1, 0))
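Since the question also asks for data.table, here is a minimal sketch of the same idea in that package. It rebuilds the sample data so it runs on its own; the object name dt is just an illustrative choice, and I have not benchmarked it against the dplyr version.

```r
library(data.table)

# Sample data from the question (without the expected-output column)
Group   <- c("a", "a", "a", "b", "b", "b", "c", "c", "c")
value_1 <- c(1.10, 2.5, 1.7, 0.99, 1.50, 1.65, 2.5, 2.5, 1.5)
value_2 <- c(0.03, 1.3, 3.5, 0.02, 4.3, 1.2, 1.4, 1.4, 3.7)
dt <- data.table(Group, value_1, value_2)

# Per group: take value_1 at the row where value_2 is largest,
# then flag every row whose value_1 is at or below that threshold.
dt[, New_variable_1 := as.integer(value_1 <= value_1[which.max(value_2)]),
   by = Group]
```

The := operator adds the column by reference, so no copy of the table is made, which may help on a large dataset.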
Solution 2 (thanks to @Gregor):
df %>%
  group_by(Group) %>%
  # NewVar1 = threshold from Solution 1 raised by 10%
  mutate(NewVar1 = value_1[which.max(value_2)] * 1.1,
         New_variable_1 = ifelse(value_1 <= NewVar1, 1, 0))
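The question mentions that Inf or NA may appear in value_2. which.max() skips NA but treats Inf as the maximum, and a group where value_2 is entirely NA gives a zero-length result that mutate() will most likely reject. The sketch below is one possible way to guard against that, assuming non-finite values should simply be ignored when picking the threshold (that is my assumption, not something the question states). The helper threshold_at_max and the small data frame df2 are hypothetical names made up for this illustration.

```r
library(dplyr)

# Hypothetical helper: value_1 at the largest *finite* value_2.
# Returns NA when the group has no finite value_2 at all.
threshold_at_max <- function(v1, v2) {
  ok <- is.finite(v2)              # FALSE for NA, NaN, Inf and -Inf
  if (!any(ok)) return(NA_real_)
  v1[ok][which.max(v2[ok])]
}

# Made-up data: group a contains Inf and NA, group b is all NA
df2 <- data.frame(
  Group   = c("a", "a", "a", "b", "b"),
  value_1 = c(1.1, 2.5, 1.7, 0.99, 1.5),
  value_2 = c(Inf, NA, 3.5, NA, NA)
)

res <- df2 %>%
  group_by(Group) %>%
  mutate(New_variable_1 = as.integer(value_1 <= threshold_at_max(value_1, value_2))) %>%
  ungroup()
```

With this choice, groups that have no usable value_2 end up with NA in New_variable_1 rather than raising an error. For Solution 2, the same helper can be multiplied by 1.1 inside mutate().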
Upvotes: 1