Reputation: 123
I have a following sample code to make one data frame containing information for more than 1 ID. I want to sort them by defined categories. In which I want to see the percentage change at specific (given time for e.h here t=10) with respect to its baseline value and return the value of that found category in output. I have explained detailed step of my calculation below.
a=c(100,105,126,130,150,100,90,76,51,40)
t=c(0,5,10,20,30)
t=rep(t,2)
ID=c(1,1,1,1,1,2,2,2,2,2)
data=data.frame(ID,t,a)
1)for all ID at t=0 "a" value is baseline
2) Computation
e.g At Given t=10 (Have to provide) take corresponding a value
%Change(answer) = (taken a value - baseline/baseline)
3) Compare the answer in the following define CATEGORIES..
#category
1-If answer>0.25
2-If -0.30<answer<0.25
3-If -1.0<answer< -0.30
4-If answer== -1.0
4)Return the value of category
ID My_Answer
1 1
2 3
Can anybody help me in this.I do understand the flow of my computation but not awre of efficient way of doing it as i have so many ID in that data frame. Thank you
Upvotes: 0
Views: 1966
Reputation: 2950
It's easier to do math with columns than with rows. So the first step is to move baseline
numbers into their own columns, then use cut
to define these groups:
library(dplyr)
library(tidyr)
foo <- data %>%
filter(t == 0) %>%
left_join(data %>%
filter(t != 0),
by = "ID") %>%
mutate(percentchange = (a.y - a.x) / a.x,
My_Answer = cut(percentchange, breaks = c(-1, -0.3, 0.25, Inf),
right = FALSE, include.lowest = TRUE, labels = c("g3","g2","g1")),
My_Answer = as.character(My_Answer),
My_Answer = ifelse(percentchange == -1, "g4", My_Answer)) %>%
select(ID, t = t.y, My_Answer)
foo
ID t.x a.x t.y a.y percentchange My_Answer
1 1 0 100 5 105 0.05 g2
2 1 0 100 10 126 0.26 g1
3 1 0 100 20 130 0.30 g1
4 1 0 100 30 150 0.50 g1
5 2 0 100 5 90 -0.10 g2
6 2 0 100 10 76 -0.24 g2
7 2 0 100 20 51 -0.49 g3
8 2 0 100 30 40 -0.60 g3
You can see that this lets us calculate My_Answer
for all values at once. if you want to find out the values for t == 10
, you can just pull out those rows:
foo %>%
filter(t == 10)
ID t My_Answer
1 1 10 g1
2 2 10 g2
Upvotes: 1