DanG
DanG

Reputation: 741

Pivot wider with tidyverse

I have a long dateset like this sample data

df <-  tibble::tribble(
      ~V1,    ~V2,      ~V3,
       1L,  "Hig", 3131000L,
       2L,  "Hig",  279000L,
       3L,  "Hig", 1316000L,
       1L, "val", 1882000L,
       2L, "val", 1433000L,
       3L, "val",  555000L,
       4L, "val",  856000L,
       1L,  "nt", 4493000L,
       2L,  "nt",  233000L,
       3L,  "nt",  693000L
      ) 

I want to change this to the wide format without taking into account the V1 variable and cols names. Desired output should be like this (exported as a txt file)

High,3131000,279000,1316000,
val,1882000,1433000,555000,856000,
nt,4493000,233000,693000,

I simply tried this

    df %>% 
      select(V2, V3) %>% 
      pivot_wider("V2",  values_from="V3" ) 

Error: Column 1 must be named.
Use .name_repair to specify repair.

Upvotes: 1

Views: 587

Answers (2)

ktiu
ktiu

Reputation: 2636

Since the lines have differing lengths in your output, I wouldn't work with dplyr's pivot_ functions. They are best used within a tabular format.

Instead you could split your data by V2 and paste that name together with the values from V3. Finally, write the results to a file output.txt like so:

library(purrr)

split(df, ~V2) %>%
  imap_chr(~ paste(c(.y, .x$V3, ""), collapse = ",")) %>%
  cat(file = "output.txt", sep = "\n")

Results in output.txt:

file.show("output.txt")
Hig,3131000,279000,1316000,
nt,4493000,233000,693000,
val,1882000,1433000,555000,856000,

Upvotes: 2

KaptajnKasper
KaptajnKasper

Reputation: 170

The problem is that you need to get the column names from somewhere.

You could do something like this

library(tidyverse)

df <-  tibble::tribble(
  ~V1,    ~V2,      ~V3,
  1L,  "Hig", 3131000L,
  2L,  "Hig",  279000L,
  3L,  "Hig", 1316000L,
  1L, "val", 1882000L,
  2L, "val", 1433000L,
  3L, "val",  555000L,
  4L, "val",  856000L,
  1L,  "nt", 4493000L,
  2L,  "nt",  233000L,
  3L,  "nt",  693000L
) 

df %>%
  select(V2, V3) %>%
  group_by(V2) %>%
  mutate(temp = row_number()) %>%
  pivot_wider(id_cols = V2, names_from=temp, values_from = V3)

which would give you

# A tibble: 3 x 5
# Groups:   V2 [3]
  V2        `1`     `2`     `3`    `4`
  <chr>   <int>   <int>   <int>  <int>
1 Hig   3131000  279000 1316000     NA
2 val   1882000 1433000  555000 856000
3 nt    4493000  233000  693000     NA

Upvotes: 2

Related Questions