Rspacer
Rspacer

Reputation: 2429

How to spread a single column based on multiple columns in R?

Each unique year, site, quadrant, and species has two values "Val" in the dataset. I want to spread the values into two columns "Val1" and "Val2". I tried to use the regular spread function but it doesn't seem like its the right fit. Any suggestions?

Year Site Quadrant Species Val
2019   1     1        A    20
2019   1     1        A    30
2019   1     1        B    20
2019   1     1        B    25
2019   1     2        A    20
2019   1     2        A    10
2019   1     2        B    11
2019   1     2        B    22

Desired Output

Year Site Quadrant Species Val1 Val2
2019   1     1        A    20    30
2019   1     1        B    20    25
2019   1     2        A    20    10
2019   1     2        B    11    22

Upvotes: 3

Views: 1014

Answers (3)

mt1022
mt1022

Reputation: 17289

using data.table::dcast and rowid:

library(data.table)
dcast(dtt,
    Year + Site + Quadrant + Species ~ rowid(Year, Site, Quadrant, Species),
    value.var = 'Val')
#    Year Site Quadrant Species  1  2
# 1: 2019    1        1       A 20 30
# 2: 2019    1        1       B 20 25
# 3: 2019    1        2       A 20 10
# 4: 2019    1        2       B 11 22

Similar operation can be done in a tidyverse way, if you prefer:

dtt %>%
    group_by(Year, Site, Quadrant, Species) %>%
    mutate(grp = row_number()) %>%
    pivot_wider(names_from = grp, values_from = Val, names_prefix = 'Val') %>%
    ungroup()

#  A tibble: 4 x 6
#    Year  Site Quadrant Species  Val1  Val2
#   <int> <int>    <int> <chr>   <int> <int>
# 1  2019     1        1 A          20    30
# 2  2019     1        1 B          20    25
# 3  2019     1        2 A          20    10
# 4  2019     1        2 B          11    22

Upvotes: 1

neilfws
neilfws

Reputation: 33782

You can group_by the columns, mutate to make the new column headers and then spread (or pivot_wider):

library(dplyr)

mydata %>% 
  group_by(Year, Site, Quadrant, Species) %>% 
  mutate(Var = paste0("Val", row_number())) %>% 
  spread(Var, Val) %>%
  ungroup()

Result:

# A tibble: 4 x 6
   Year  Site Quadrant Species  Val1  Val2
  <int> <int>    <int> <chr>   <int> <int>
1  2019     1        1 A          20    30
2  2019     1        1 B          20    25
3  2019     1        2 A          20    10
4  2019     1        2 B          11    22

Data:

mydata <- read.table(text = "Year Site Quadrant Species Val
2019   1     1        A    20
2019   1     1        A    30
2019   1     1        B    20
2019   1     1        B    25
2019   1     2        A    20
2019   1     2        A    10
2019   1     2        B    11
2019   1     2        B    22", header = TRUE)

Upvotes: 3

TarJae
TarJae

Reputation: 78927

You can do this this way: with lead

library(tidyverse)
df %>% 
  mutate(id = row_number(),
         Val2 = lead(Val)) %>% 
  filter(id %% 2 == 1) %>% 
  select(-id, Val1  = Val)

Output:

   Year  Site Quadrant Species  Val1  Val2
  <dbl> <dbl>    <dbl> <chr>   <dbl> <dbl>
1  2019     1        1 A          20    30
2  2019     1        1 B          20    25
3  2019     1        2 A          20    10
4  2019     1        2 B          11    22

data:

df <- tribble(
~Year, ~Site, ~Quadrant, ~Species, ~Val,
2019, 1, 1, "A", 20, 
  2019, 1, 1, "A", 30, 
  2019, 1, 1, "B", 20, 
  2019, 1, 1, "B", 25, 
  2019, 1, 2, "A", 20, 
  2019, 1, 2, "A", 10, 
  2019, 1, 2, "B", 11, 
  2019, 1, 2, "B", 22)

Upvotes: 2

Related Questions