iayork

Reputation: 6699

Normalize R dataframe

I'm measuring various parameters for a bunch of individuals over a period of a couple weeks. I want to normalize the values for each individual to that individual's value on day 0, before treatment began. Example data.frame:

id  Day wt  bp
a   0   10  100.00
b   0   15  120.00
c   0   12  150.00
a   1   7.5 120.00
b   1   10  150.00
c   1   8   175.00
b   2   5   110.00
c   2   4   140

So individual "b" needs their "wt" and "bp" values from days 1 and 2 expressed as a percentage of their own values from day 0. Some individuals drop out, so the number of individuals changes over the time period. The normalized data.frame would look like this:

id  Day wt      bp
a   0   100.0   100.0
b   0   100.0   100.0
c   0   100.0   100.0
a   1   75.0    120.0
b   1   66.7    125.0
c   1   66.7    116.7
b   2   33.3    91.7
c   2   33.3    93.3

I could do it by stepping through each day/id one at a time, but it seems like there should be a way to do it with one of the apply variants.

Upvotes: 0

Views: 340

Answers (2)

missuse

Reputation: 19716

Perhaps this way:

library(tidyverse)
df %>%
  group_by(id) %>%
  # divide each value by the same individual's day-0 value, as a percentage
  mutate(wt_norm = wt/wt[Day == 0]*100,
         bp_norm = bp/bp[Day == 0]*100)
#output
# A tibble: 8 x 6
# Groups: id [3]
  id      Day    wt    bp wt_norm bp_norm
  <fct> <int> <dbl> <dbl>   <dbl>   <dbl>
1 a         0 10.0    100   100     100  
2 b         0 15.0    120   100     100  
3 c         0 12.0    150   100     100  
4 a         1  7.50   120    75.0   120  
5 b         1 10.0    150    66.7   125  
6 c         1  8.00   175    66.7   117  
7 b         2  5.00   110    33.3    91.7
8 c         2  4.00   140    33.3    93.3

or, as suggested by @C. Braun in the comments (this assumes the day-0 row comes first within each id):

df %>%
  group_by(id) %>%
  mutate(wt_norm = wt/first(wt)*100,
         bp_norm = bp/first(bp)*100)

or as suggested by @iayork in the comment:

df %>%
  group_by(id) %>%
  transmute_at(vars(-Day),funs(./first(.)*100))
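
A note for anyone reading this with a newer dplyr: funs() was deprecated in dplyr 0.8.0, and the scoped verbs (mutate_at(), transmute_at()) were superseded by across() in dplyr 1.0.0. Below is a minimal sketch of the same idea with across(), assuming dplyr >= 1.0.0. It picks the baseline with Day == 0 rather than first(), so it should not depend on row order. The name `norm` is just an illustrative choice, and the columns are overwritten in place.

```r
library(dplyr)

# Same example data as in the question
df <- read.table(text = "id Day wt bp
a 0 10 100.00
b 0 15 120.00
c 0 12 150.00
a 1 7.5 120.00
b 1 10 150.00
c 1 8 175.00
b 2 5 110.00
c 2 4 140", header = TRUE)

norm <- df %>%
  group_by(id) %>%
  # for each listed column, divide by that individual's day-0 value
  mutate(across(c(wt, bp), ~ .x / .x[Day == 0] * 100)) %>%
  ungroup()
```

This assumes every id has exactly one Day == 0 row; an id with no baseline row would need separate handling.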

data:

df <- read.table(text = "id  Day wt  bp
           a   0   10  100.00
           b   0   15  120.00
           c   0   12  150.00
           a   1   7.5 120.00
           b   1   10  150.00
           c   1   8   175.00
           b   2   5   110.00
           c   2   4   140", header = TRUE)
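
If you would rather stay in base R, here is a rough sketch of one way to do it without any of the apply variants: pull out the day-0 rows, line them up against every row with match(), and divide. The names `cols`, `baseline`, `idx` and `norm` are my own illustrative choices, and it assumes each id has exactly one Day == 0 row.

```r
# Same example data as in the question
df <- read.table(text = "id Day wt bp
a 0 10 100.00
b 0 15 120.00
c 0 12 150.00
a 1 7.5 120.00
b 1 10 150.00
c 1 8 175.00
b 2 5 110.00
c 2 4 140", header = TRUE)

cols <- c("wt", "bp")              # columns to normalize
baseline <- df[df$Day == 0, ]      # one row per individual (assumed)
idx <- match(df$id, baseline$id)   # baseline row for each row of df

norm <- df
# element-wise division of each row by its own individual's baseline row
norm[cols] <- df[cols] / baseline[idx, cols] * 100
```

Because the baseline is looked up by id, this should work regardless of how the rows are sorted, and adding more measurement columns only means extending `cols`.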

Upvotes: 3

DJV

Reputation: 4863

If you have more columns in your data.frame, you can use mutate_at():

library(tidyverse)

set.seed(123)
df <- data.frame(id = rep(c("a","b","c"), times = 3), 
                 day = c(rep(0, 3), rep(1, 3), rep(2, 3)), 
                 wt = rnorm(9, mean = 10, sd = 2), 
                 bp = rnorm(9, mean = 120, sd = 10))

df %>% 
  group_by(id) %>%
  mutate_at(vars(-day),funs(varNorm = ./first(.)*100))
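
On current dplyr (>= 1.0.0, which I am assuming here) the same suffix-style columns can probably be had from across() with its .names argument, which keeps the original columns next to the normalized ones. A minimal sketch follows; `res` is just an illustrative name, and first() again relies on the day-0 row coming first within each id.

```r
library(dplyr)

set.seed(123)
df <- data.frame(id = rep(c("a", "b", "c"), times = 3),
                 day = rep(0:2, each = 3),
                 wt = rnorm(9, mean = 10, sd = 2),
                 bp = rnorm(9, mean = 120, sd = 10))

res <- df %>%
  group_by(id) %>%
  # every non-grouping column except day gets a "<col>_varNorm" companion
  mutate(across(-day, ~ .x / first(.x) * 100, .names = "{.col}_varNorm")) %>%
  ungroup()
```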

Upvotes: 1
