geodex
geodex

Reputation: 1249

Transposing in dplyr

I have the following data.frame

df = structure(list(HEADER = c("HOME_TRPM", "AWAY_TRPM", "HOME_TEAM","AWAY_TEAM"),
                     price = c("0.863104076023855", "-0.845186446996287","CHA", "NOP")),
               .Names = c("HEADER", "price"), row.names = c(NA, 4L), class = "data.frame")

df
#>      HEADER              price
#> 1 HOME_TRPM  0.863104076023855
#> 2 AWAY_TRPM -0.845186446996287
#> 3 HOME_TEAM                CHA
#> 4 AWAY_TEAM                NOP

which I want to transpose. How can I do it in dplyr without using t()? I tried

df %>% tidyr::spread(HEADER , price)

but it doesn't give a flat structure but instead does this:

structure(list(AWAY_TEAM = c(NA, NA, NA, "NOP"),
     AWAY_TRPM = c(NA, "-0.845186446996287", NA, NA), 
     HOME_TEAM = c(NA, NA, "CHA", NA),
     HOME_TRPM = c("0.863104076023855", NA, NA, NA)),
 .Names = c("AWAY_TEAM", "AWAY_TRPM", "HOME_TEAM", "HOME_TRPM"),
 class = "data.frame", row.names = c(NA, 4L))

The resulting data.frame should be like this:

structure(list(HOME_TRPM = "0.863104076023855",
    AWAY_TRPM = "-0.845186446996287",
    HOME_TEAM = "CHA", 
    AWAY_TEAM = "NOP"), 
.Names = c("HOME_TRPM", "AWAY_TRPM", "HOME_TEAM", "AWAY_TEAM"), 
row.names = c(NA, -1L), class = "data.frame"))

Upvotes: 28

Views: 81027

Answers (5)

llewmills
llewmills

Reputation: 3590

Very ugly but a way around the problem @Ben Bolker mentioned, that allows you to move groups of rows. Say we wanted to move rows for IDs 2 through 4 all at once to after 5

df %>%
  column_to_rownames(var = "ID") %>%
    t %>%
      as.data.frame %>%
        relocate(`2`:`4`,
                 .after = `5`) %>%
          t %>%
            as.data.frame %>%
              rownames_to_column(var = "ID") 

# output
#   ID var1 var2
# 1  1    a    1
# 2  5    e    1
# 3  2    b    1
# 4  3    c    0
# 5  4    d    0

Of course this is just retrofitting the fanastic functionality of relocate to rows. Others who know R better than me may have a neater solution. If only there was a relocate_rows()!

Upvotes: 0

Samuel
Samuel

Reputation: 144

Using as_tibble() function from tibble package, you can eliminate the undesirable effects of t().

df_t = as_tibble(t(df[, -1]))
names(df_t) = df[, 1]

Upvotes: 9

Revan
Revan

Reputation: 2322

spread is retired, tidyr now suggests the usage of pivot_wider():

library(tidyverse)
df %>%
    pivot_wider(names_from = HEADER, values_from = price)

Upvotes: 28

KeelyD
KeelyD

Reputation: 161

They must have updated tidyr since this was originally posted because I think it does what you were originally asking for now:

> library(dplyr)
> library(tidyr)
Warning message:
package ‘tidyr’ was built under R version 3.4.4 
> df
         HEADER              price
    1 HOME_TRPM  0.863104076023855
    2 AWAY_TRPM -0.845186446996287
    3 HOME_TEAM                CHA
    4 AWAY_TEAM                NOP

    > tidyr::spread(df, HEADER, price)
      AWAY_TEAM          AWAY_TRPM HOME_TEAM         HOME_TRPM
    1       NOP -0.845186446996287       CHA 0.863104076023855

If you have a bigger data frame you can always gather and then spread:

> mdf <- data.frame(Things = c("Cookies","Cake","Knives","Kittens", "Politics"), Darkness = sample(1:5), Despair = sample(1:5), Defeat = sample(1:5))> mdf 
    Things Darkness Despair Defeat
1  Cookies        3       4      1
2     Cake        2       2      5
3   Knives        1       3      2
4  Kittens        5       5      3
5 Politics        4       1      4
> mdf %>% tidyr::gather(Idea, Warning_Level, Darkness:Defeat)
     Things     Idea Warning_Level
1   Cookies Darkness             3
2      Cake Darkness             2
3    Knives Darkness             1
4   Kittens Darkness             5
5  Politics Darkness             4
6   Cookies  Despair             4
7      Cake  Despair             2
8    Knives  Despair             3
9   Kittens  Despair             5
10 Politics  Despair             1
11  Cookies   Defeat             1
12     Cake   Defeat             5
13   Knives   Defeat             2
14  Kittens   Defeat             3
15 Politics   Defeat             4
> mdf %>% tidyr::gather(Idea, Warning_Level, Darkness:Defeat) %>% tidyr::spread(Things, Warning_Level)
      Idea Cake Cookies Kittens Knives Politics
1 Darkness    2       3       5      1        4
2   Defeat    5       1       3      2        4
3  Despair    2       4       5      3        1

Upvotes: 5

jeremycg
jeremycg

Reputation: 24945

I think you want tidyr rather than dplyr:

library(tidyr)
library(dplyr)
df %>% mutate(group = 1) %>%
       spread(HEADER, price)

  group AWAY_TEAM          AWAY_TRPM HOME_TEAM         HOME_TRPM
1     1       NOP -0.845186446996287       CHA 0.863104076023855

Using this, you can specify your groupings - and you can add on select(-group) to remove them later.

Upvotes: 29

Related Questions