Warwick Wang

Reputation: 199

Julia: transposing grouped data by passing a Tuple of column selectors

using InMemoryDatasets

ds = Dataset([[1, 1, 1, 2, 2, 2],
              ["foo", "bar", "monty", "foo", "bar", "monty"],
              ["a", "b", "c", "d", "e", "f"],
              [1, 2, 3, 4, 5, 6]], [:g, :key, :foo, :bar])

In InMemoryDatasets.jl, the transpose function accepts a Tuple of column selectors, so several columns can be transposed in one call:

transpose(groupby(ds, :g), (:foo, :bar), id = :key)
Result:

 Row │ g         foo       bar       monty     foo_1     bar_1     monty_1
     │ identity  identity  identity  identity  identity  identity  identity
     │ Int64?    String?   String?   String?   Int64?    Int64?    Int64?
─────┼──────────────────────────────────────────────────────────────────────
   1 │        1  a         b         c                1         2         3
   2 │        2  d         e         f                4         5         6

Question:

How can I do this in DataFrames.jl?

How can I do this in R and Python?

Upvotes: 3

Views: 171

Answers (1)

akrun

Reputation: 887118

In R, pivot_wider can be used for reshaping.

library(tidyr)
pivot_wider(ds, names_from = key, values_from = c(foo, bar))

-output

# A tibble: 2 × 7
      g foo_foo foo_bar foo_monty bar_foo bar_bar bar_monty
  <dbl> <chr>   <chr>   <chr>       <int>   <int>     <int>
1     1 a       b       c               1       2         3
2     2 d       e       f               4       5         6

If we want the same column names as in the Julia output, we can rename the columns before and after reshaping:

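library(dplyr)
library(tidyr)
library(stringr)

ds %>%
  # temporary names so that names_glue yields "<key>_grp" and "<key>_1"
  rename(grp = foo, `1` = bar) %>%
  pivot_wider(names_from = key, values_from = c(grp, `1`),
              names_glue = "{key}_{.value}") %>%
  # drop the "_grp" suffix, leaving foo, bar, monty
  rename_with(~ str_remove(.x, "_grp"), ends_with("_grp"))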

-output

# A tibble: 2 × 7
      g foo   bar   monty foo_1 bar_1 monty_1
  <dbl> <chr> <chr> <chr> <int> <int>   <int>
1     1 a     b     c         1     2       3
2     2 d     e     f         4     5       6

data

ds <- structure(list(g = c(1, 1, 1, 2, 2, 2), key = c("foo", "bar", 
"monty", "foo", "bar", "monty"), foo = c("a", "b", "c", "d", 
"e", "f"), bar = 1:6), class = "data.frame", row.names = c(NA, 
-6L))
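
The question also asks about Python. As a rough sketch of what the reshape does mechanically, here is a plain-Python version that uses only the standard library (no dataframe package). The helper name `pivot_wider` and its parameters are my own, borrowed from the tidyr call above, and the `_1`, `_2`, ... suffix rule is an assumption that simply mimics the column names in the Julia output; it is not taken from any library.

```python
# Same data as `ds`, stored as a list of row dictionaries.
rows = [
    {"g": 1, "key": "foo",   "foo": "a", "bar": 1},
    {"g": 1, "key": "bar",   "foo": "b", "bar": 2},
    {"g": 1, "key": "monty", "foo": "c", "bar": 3},
    {"g": 2, "key": "foo",   "foo": "d", "bar": 4},
    {"g": 2, "key": "bar",   "foo": "e", "bar": 5},
    {"g": 2, "key": "monty", "foo": "f", "bar": 6},
]


def pivot_wider(rows, id_col, names_from, values_from):
    """Hypothetical helper: one output row per id_col value, one output
    column per (value column, label) pair."""
    # Distinct labels in first-seen order: foo, bar, monty.
    labels = list(dict.fromkeys(r[names_from] for r in rows))
    # Assumed naming rule: the first value column keeps the bare label,
    # later value columns get the suffix _1, _2, ...
    names = {
        (col, label): label if j == 0 else f"{label}_{j}"
        for j, col in enumerate(values_from)
        for label in labels
    }
    wide = {}
    for r in rows:
        # Start each group with every output column present (None if missing).
        blank = {id_col: r[id_col], **dict.fromkeys(names.values())}
        out = wide.setdefault(r[id_col], blank)
        for col in values_from:
            out[names[(col, r[names_from])]] = r[col]
    return list(wide.values())


result = pivot_wider(rows, "g", "key", ["foo", "bar"])
for row in result:
    print(row)
```

Each printed dictionary corresponds to one row of the wide table, with the keys in the same column order as the renamed tibble above.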

Upvotes: 5
