Reputation: 85
I've used the dplyr package to summarize some data. The dataframe I produced looks something like this:
Iteration Degree Proportion
1 0 .5
1 30 .7
1 60 .8
2 0 .6
2 30 .9
3 0 .3
3 30 .8
3 60 .8
I would like to transform my dataframe into a new dataframe where each of the 3 degree conditions is its own column, filled in with the corresponding proportion values. MOST IMPORTANTLY, I need to fill in NA whenever an iteration does not have a value for a given degree.
The dataframe I am thinking of would look something like this:
Iteration 0_Degree 30_Degree 60_Degree
1 .5 .7 .8
2 .6 .9 NA
3 .3 .8 .8
Identifying where the NAs need to be filled in is the major challenge I am running into at the moment.
Does anyone have an idea for how I might accomplish this?
Thank you!
Upvotes: 2
Views: 248
Reputation: 36
This can easily be achieved with the spread function from the tidyr package. tidyr is part of the tidyverse.
Simply use:
library(tidyverse)

df %>%
  spread(key = Degree, value = Proportion)
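By default, spread fills missing combinations with NA (the default is fill = NA, the missing value, not the string 'NA'), so an iteration without a given degree gets NA in that column.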
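In current tidyr releases, spread is documented as superseded by pivot_wider. Below is a minimal sketch of the same reshape, assuming tidyr 1.0.0 or later. The object names df and wide are illustrative, and the names_glue pattern is one way to get the 0_Degree style column names asked for in the question.

```r
library(tidyr)

# Example data copied from the question (df is an assumed name)
df <- data.frame(
  Iteration  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  Degree     = c(0L, 30L, 60L, 0L, 30L, 0L, 30L, 60L),
  Proportion = c(0.5, 0.7, 0.8, 0.6, 0.9, 0.3, 0.8, 0.8)
)

# One row per Iteration, one column per Degree.
# Iteration/Degree combinations absent from df are left as NA.
wide <- pivot_wider(
  df,
  names_from  = Degree,
  values_from = Proportion,
  names_glue  = "{Degree}_Degree"  # builds 0_Degree, 30_Degree, 60_Degree
)

print(wide)
```

Because the new column names start with a digit, they need backticks or double-bracket indexing when referenced later, for example wide[["60_Degree"]].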
Upvotes: 2
Reputation: 269501
Omit as.data.frame if you don't need the result as a data frame; without it, tapply returns a matrix. No packages are used.
as.data.frame(tapply(dd[[3]], dd[-3], c))
giving:
    0  30  60
1 0.5 0.7 0.8
2 0.6 0.9  NA
3 0.3 0.8 0.8
The input in reproducible form is:
dd <- structure(list(Iteration = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
Degree = c(0L, 30L, 60L, 0L, 30L, 0L, 30L, 60L), Proportion = c(0.5,
0.7, 0.8, 0.6, 0.9, 0.3, 0.8, 0.8)), class = "data.frame", row.names = c(NA,
-8L))
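If the exact layout from the question is wanted (an explicit Iteration column and 0_Degree style names), the tapply result can be post-processed in base R. This is only a sketch; the names res and out are illustrative and not part of the answer above.

```r
# Same data as dd above, rebuilt here so the snippet runs on its own
dd <- data.frame(
  Iteration  = c(1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L),
  Degree     = c(0L, 30L, 60L, 0L, 30L, 0L, 30L, 60L),
  Proportion = c(0.5, 0.7, 0.8, 0.6, 0.9, 0.3, 0.8, 0.8)
)

# Cross-tabulate Proportion by Iteration (rows) and Degree (columns);
# cells with no observation come back as NA
res <- as.data.frame(tapply(dd$Proportion, dd[c("Iteration", "Degree")], c))

# Rename the columns to the 0_Degree, 30_Degree, 60_Degree pattern
names(res) <- paste0(names(res), "_Degree")

# Move the row names into a proper Iteration column;
# check.names = FALSE keeps names that start with a digit unchanged
out <- data.frame(
  Iteration = as.integer(rownames(res)),
  res,
  check.names = FALSE,
  row.names = NULL
)

print(out)
```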
Upvotes: 1