Split the values of one variable into data frame columns by the levels of a grouping variable

Question

How can I spread the values of one variable into separate columns of a data frame, with one column per level of a grouping variable?

Suppose I have the following data frame:

Site Species dbh
1    sp1     2.8
1    sp2     2.2
2    sp1     4.0
2    sp2     1.5
3    sp1     3.9
3    sp2     2.5

I want the output below, in which each level of the grouping variable (Species) becomes a column of the data frame and the dbh values fill in the column for their level.

Site sp1  sp2
1    2.8  2.2
2    4.0  1.5
3    3.9  2.5

I would be thankful for your valuable suggestions.

Regards,

Farhan

Vincent · Accepted Answer

This is called a “reshape” or a “pivot”. There are hundreds of tutorials and SO questions out there about it.

dat <- read.table(header = TRUE, text = "
Site Species dbh
1    sp1     2.8
1    sp2     2.2
2    sp1     4.0
2    sp2     1.5
3    sp1     3.9
3    sp2     2.5")

With the tidyverse:

library(tidyr)

dat %>% pivot_wider(values_from = "dbh", names_from = "Species")
#> # A tibble: 3 x 3
#>    Site   sp1   sp2
#>   <int> <dbl> <dbl>
#> 1     1   2.8   2.2
#> 2     2   4     1.5
#> 3     3   3.9   2.5

With data.table:

library(data.table)
setDT(dat)

dcast(dat, Site ~ Species)
#> Using 'dbh' as value column. Use 'value.var' to override
#>    Site sp1 sp2
#> 1:    1 2.8 2.2
#> 2:    2 4.0 1.5
#> 3:    3 3.9 2.5
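
For completeness, the same reshape can probably be done without any extra packages. The following is a minimal sketch using base R's reshape(), assuming each Site/Species pair occurs only once, as in the example data. The name `wide` is just an illustrative choice, and the renaming step is an assumption about the desired column names rather than part of the answer above.

```r
# Rebuild the example data as a plain data.frame so the sketch is self-contained
dat <- read.table(header = TRUE, text = "
Site Species dbh
1    sp1     2.8
1    sp2     2.2
2    sp1     4.0
2    sp2     1.5
3    sp1     3.9
3    sp2     2.5")

# One row per Site, one column per level of Species
wide <- reshape(dat, idvar = "Site", timevar = "Species", direction = "wide")

# reshape() names the new columns dbh.sp1 and dbh.sp2; drop the "dbh." prefix
names(wide) <- sub("^dbh\\.", "", names(wide))

# Row names are carried over from the long data; reset them for tidier printing
rownames(wide) <- NULL

print(wide)
```

reshape() is more verbose than pivot_wider() or dcast(), but it avoids a dependency, which may matter in scripts that otherwise rely only on base R.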
