Reputation: 1860
_aI am trying to transform the dataset. What i want to achieve is that if Intensitat
= 1 - R creates a new column - Intensity 1a - and writes the value of damage_a
. If Intensitat
= 2 - R creates another new column - Intensity 2a - and writes the values of damage_a
. Finally if Intensitat = 3 then R does new column - Intensity 3a - and writes the values of damage_a
. (it always writes the value of damage_x corresponding to the same row as the intensity indicator).
Then I will run the same process: create 3 columns from an indicator variables - each column for an indicator value and in each column use the values of damage_b
.
Lastly the same again but use the values of damage_c
.
type <- sample(seq(from = 1, to = 5, by = 1), size = 50, replace = TRUE)
type
intensity <- sample(seq(from = 1, to = 3, by = 1), size = 50, replace = TRUE)
intensity
damage_a <- sample(seq(from = 1, to = 50, by = 1), size = 50, replace = TRUE)
damage_b <- sample(seq(from = 1, to = 200, by = 1), size = 50, replace = TRUE)
damage_c <- sample(seq(from = 1, to = 500, by = 1), size = 50, replace = TRUE)
dat <- data.frame(type, intensity, damage_a, damage_b, damage_c)
# want to evaluate if the difference in mean between different damages
dat1 <- subset(dat, select = c(type, intensity, damage_a))
dat2 <- subset(dat, select = c(type, intensity, damage_b))
dat3 <- subset(dat, select = c(type, intensity, damage_c))
It might seem like an very easy task for you, but I have been stuck on this for two days. I have tried applyin tidyr function, ifelse comands, etc but never obtained what i needed. Any comments/ideas are very appriciated. Thank you.
EDIT:
I managed to change the format if the table with this little handy function.
df.out <- unstack(dat, damage_a ~ intensity)
The problem now is that i get lists that differ in lenght. How can i combine these lists into ona data frame?
Upvotes: 1
Views: 61
Reputation: 887531
We can use dcast
from data.table
which can take multiple value.var
columns. As there are multiple observations per group, we can use the rowid
function to create a sequence variable by 'type' and 'intensity' in the formula. This will result in a new column 'type_1', but it can be assigned to NULL
later
library(data.table)
dcast(setDT(dat), type + rowid(type, intensity) ~ paste0("intensity_",
intensity), value.var = c('damage_a', 'damage_b', 'damage_c'))[, type_1 := NULL][]
Upvotes: 1
Reputation: 1345
Check out the gather and spread functions in tidyr, e.g.
dat <- tidyr::spread(dat, damage, value, -intensity, -type)
tidyr::spread(dat, damage, value)
Upvotes: 0