Reputation: 73

How to group by data in one column into multiple columns keeping rows

Identifier	Value
511016	75.72911
511016	79.01783
511016	74.87570
511029	72.75873
511029	74.41798
511029	78.56112

The dataset consists of two columns (as above) but 77,000 rows. The first column is the 'name' vector and the second the value. I need to transform the data so that the first column has just one value for the identifier and after this, the columns take all the values that the respective identifier has. Like this:

Identifier	Value 1	Value 2	etc...
511016	75.72911	79.01783
511029	72.75873	74.41798

I have been able to use group_keys to produce a single column with the identifiers but cannot get the values to match in this way. Any help is appreciated.

Upvotes: 3

Answers (3)

Julian

Reputation: 9240

library(dplyr)
library(tidyr)

df <- tribble(~Identifier,  ~Value,
511016,     75.72911,
511016,     79.01783,
511016,     74.87570,
511029,     72.75873,
511029,     74.41798,
511029,     78.56112
) 


df %>% 
  group_by(Identifier) %>% 
  mutate(row_id = row_number()) %>% 
  pivot_wider(id_cols = "Identifier",  names_from = row_id,
              values_from = Value, 
              names_glue = "{.value}{row_id}") %>% 
  ungroup()

Upvotes: 4

TarJae

Reputation: 78917

Assuming your df has two columns, One similar Option is:

library(dplyr)
library(tidyr)

df %>% 
  group_by(Identifier) %>% 
  mutate(name = paste(colnames(df[2]), row_number())) %>% 
  pivot_wider(names_from = name, values_from = Value)

  Identifier `Value 1` `Value 2` `Value 3`
       <int>     <dbl>     <dbl>     <dbl>
1     511016      75.7      79.0      74.9
2     511029      72.8      74.4      78.6

Upvotes: 1

DaveArmstrong

Reputation: 21757

This should do it:

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidyr)
dat <- structure(list(Identifier = c(511016L, 511016L, 511016L, 511029L, 
                              511029L, 511029L), 
               Value = c(75.72911, 79.01783, 74.8757, 72.75873, 
                         74.41798, 78.56112)), row.names = c(NA, 6L), class = "data.frame")
dat %>% 
  group_by(Identifier) %>% 
  mutate(obs = row_number()) %>% 
  pivot_wider(names_from = "obs", 
                    values_from = "Value", 
                    names_prefix="value")
#> # A tibble: 2 × 4
#> # Groups:   Identifier [2]
#>   Identifier value1 value2 value3
#>        <int>  <dbl>  <dbl>  <dbl>
#> 1     511016   75.7   79.0   74.9
#> 2     511029   72.8   74.4   78.6

^{Created on 2022-04-29 by the reprex package (v2.0.1)}

Upvotes: 3

How to group by data in one column into multiple columns keeping rows

Answers (3)

Related Questions