Reputation: 207
I have a data frame that I would like to convert the values in one column into new columns, using the data in the adjacent column. Each factor in df$species
would become a new column, and the data for the new column would be the respective data from df$fish_num
however, I'm really confused as to go about how to do this and don't know where to start!
This is my current df:
site treatment section species fish_num
1 Site 1 Control A parr 7
2 Site 1 Control A salmon 6
3 Site 1 Control B trout 4
4 Site 1 Control B salmon 12
5 Site 1 Treatment A parr 8
6 Site 1 Treatment A salmon 5
7 Site 1 Treatment B trout 15
8 Site 1 Treatment B salmon 9
df <- structure(list(site = c("Site 1", "Site 1", "Site 1", "Site 1",
"Site 1", "Site 1", "Site 1", "Site 1"), treatment = c("Control",
"Control", "Control", "Control", "Treatment", "Treatment", "Treatment",
"Treatment"), section = c("A", "A", "B", "B", "A", "A", "B",
"B"), species = c("parr", "salmon", "trout", "salmon", "parr",
"salmon", "trout", "salmon"), fish_num = c(7L, 6L, 4L, 12L, 8L,
5L, 15L, 9L)), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6", "7", "8"))
And I would like to be able to produce the following:
site treatment section fish_num parr salmon trout
1 Site 1 Control A 7 7 0 0
2 Site 1 Control A 6 0 6 0
3 Site 1 Control B 4 0 0 4
4 Site 1 Control B 12 0 12 0
5 Site 1 Treatment A 8 8 0 0
6 Site 1 Treatment A 5 0 5 0
7 Site 1 Treatment B 15 0 0 15
8 Site 1 Treatment B 9 0 9 0
I'm unsure as to the best approach!
Upvotes: 0
Views: 382
Reputation: 3888
using tidyverse
's pivot_wider
df %>%
mutate(id=1:nrow(.)) %>%
pivot_wider(names_from = species, values_from=fish_num, id_cols=everything())
#> site treatment section id parr salmon trout
#> <fct> <fct> <fct> <int> <int> <int> <int>
#> 1 Site 1 Control A 1 7 NA NA
#> 2 Site 1 Control A 2 NA 6 NA
#> 3 Site 1 Control B 3 NA NA 4
#> 4 Site 1 Control B 4 NA 12 NA
#> 5 Site 1 Treatment A 5 8 NA NA
#> 6 Site 1 Treatment A 6 NA 5 NA
#> 7 Site 1 Treatment B 7 NA NA 15
#> 8 Site 1 Treatment B 8 NA 9 NA
Upvotes: 1
Reputation: 1411
One way is using pivot_wider()
from tidyr
if you wanna stay within the tidyverse. I also added call to mutate_at()
and mutate()
to replace missing values with zeroes and calculate the column fish_num
.
library(tidyverse)
df %>%
pivot_wider(names_from = species,
values_from = fish_num) %>%
mutate_at(c("parr", "salmon", "trout"), ~replace(., is.na(.), 0)) %>%
mutate(fish_num = parr+salmon+trout)
# A tibble: 4 x 7
site treatment section parr salmon trout fish_num
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 Site 1 Control A 7 6 0 13
2 Site 1 Control B 0 12 4 16
3 Site 1 Treatment A 8 5 0 13
4 Site 1 Treatment B 0 9 15 24
Upvotes: 2
Reputation: 101247
A simple base R option is using xtabs
cbind(df,unclass(t(xtabs(fish_num ~species + q,cbind(df,q = 1:nrow(df))))))
which gives
site treatment section species fish_num parr salmon trout
1 Site 1 Control A parr 7 7 0 0
2 Site 1 Control A salmon 6 0 6 0
3 Site 1 Control B trout 4 0 0 4
4 Site 1 Control B salmon 12 0 12 0
5 Site 1 Treatment A parr 8 8 0 0
6 Site 1 Treatment A salmon 5 0 5 0
7 Site 1 Treatment B trout 15 0 0 15
8 Site 1 Treatment B salmon 9 0 9 0
Data
> dput(df)
structure(list(site = c("Site 1", "Site 1", "Site 1", "Site 1",
"Site 1", "Site 1", "Site 1", "Site 1"), treatment = c("Control",
"Control", "Control", "Control", "Treatment", "Treatment", "Treatment",
"Treatment"), section = c("A", "A", "B", "B", "A", "A", "B",
"B"), species = c("parr", "salmon", "trout", "salmon", "parr",
"salmon", "trout", "salmon"), fish_num = c(7L, 6L, 4L, 12L, 8L,
5L, 15L, 9L)), class = "data.frame", row.names = c("1", "2",
"3", "4", "5", "6", "7", "8"))
Upvotes: 1
Reputation: 1277
You can use the trick that TRUE and FALSE in R numerically represent zero and one:
df$salmon <- df$fish_num * (df$species == "salmon")
df$trout <- df$fish_num * (df$species == "trout")
To do this for all occurring species, put it in a loop over levels(df$species)
.
Upvotes: 1