Reputation: 199
using InMemoryDatasets

ds = Dataset([[1, 1, 1, 2, 2, 2],
              ["foo", "bar", "monty", "foo", "bar", "monty"],
              ["a", "b", "c", "d", "e", "f"],
              [1, 2, 3, 4, 5, 6]], [:g, :key, :foo, :bar])
In InMemoryDatasets, the transpose function accepts a Tuple of column selectors, so several columns can be transposed in one call:
transpose(groupby(ds, :g), (:foo, :bar), id = :key)
Result:
 Row  g         foo       bar       monty     foo_1     bar_1     monty_1
      identity  identity  identity  identity  identity  identity  identity
      Int64?    String?   String?   String?   Int64?    Int64?    Int64?
   1         1  a         b         c                1         2         3
   2         2  d         e         f                4         5         6
Question:
How can I do this in DataFrames.jl?
How can I do this in R and Python?
Upvotes: 3
Views: 171
Reputation: 887118
In R, pivot_wider can be used for reshaping.
library(tidyr)
pivot_wider(ds, names_from = key, values_from = c(foo, bar))
-output
# A tibble: 2 × 7
      g foo_foo foo_bar foo_monty bar_foo bar_bar bar_monty
  <dbl> <chr>   <chr>   <chr>       <int>   <int>     <int>
1     1 a       b       c               1       2         3
2     2 d       e       f               4       5         6
If we want the same column names as in the question, we can rename the columns before and after pivoting:
library(dplyr)
library(stringr)
ds %>%
  rename("grp" = 'foo', '1' = 'bar') %>%
  pivot_wider(names_from = key, values_from = c("grp", `1`),
              names_glue = "{key}_{.value}") %>%
  rename_with(~ str_remove(.x, "_grp"), ends_with('_grp'))
-output
# A tibble: 2 × 7
      g foo   bar   monty foo_1 bar_1 monty_1
  <dbl> <chr> <chr> <chr> <int> <int>   <int>
1     1 a     b     c         1     2       3
2     2 d     e     f         4     5       6
ds <- structure(list(g = c(1, 1, 1, 2, 2, 2), key = c("foo", "bar",
"monty", "foo", "bar", "monty"), foo = c("a", "b", "c", "d",
"e", "f"), bar = 1:6), class = "data.frame", row.names = c(NA,
-6L))
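For the Python part of the question, here is a minimal, library-free sketch of the same reshape. It is only meant to make the logic explicit: one output row per value of g, with each key spread into a column for every value column. The helper name pivot_wider_rows and its parameters are hypothetical (not from any library), and the naming rule (bare key for the first value column, key_1, key_2, ... for later ones) is an assumption chosen to match the column names in the question.

```python
# Same data as in the question, built row by row.
g = [1, 1, 1, 2, 2, 2]
key = ["foo", "bar", "monty", "foo", "bar", "monty"]
foo = ["a", "b", "c", "d", "e", "f"]
bar = [1, 2, 3, 4, 5, 6]
rows = [dict(g=a, key=b, foo=c, bar=d) for a, b, c, d in zip(g, key, foo, bar)]


def pivot_wider_rows(rows, id_col, names_from, values_from):
    """Hypothetical helper: spread `names_from` into columns,
    producing one output row per distinct `id_col` value."""

    def col_name(k, i):
        # Assumed naming rule: first value column keeps the bare key,
        # later value columns get a numeric suffix (foo_1, bar_1, ...).
        return k if i == 0 else f"{k}_{i}"

    # Distinct keys in first-seen order (dicts preserve insertion order).
    keys = list(dict.fromkeys(r[names_from] for r in rows))

    wide = {}
    for r in rows:
        out = wide.setdefault(r[id_col], {id_col: r[id_col]})
        for i, value_col in enumerate(values_from):
            out[col_name(r[names_from], i)] = r[value_col]

    # Fix the column order: id column, then all keys per value column.
    columns = [id_col] + [
        col_name(k, i) for i in range(len(values_from)) for k in keys
    ]
    return [{c: row.get(c) for c in columns} for row in wide.values()]


result = pivot_wider_rows(rows, "g", "key", ["foo", "bar"])
for row in result:
    print(row)
# {'g': 1, 'foo': 'a', 'bar': 'b', 'monty': 'c', 'foo_1': 1, 'bar_1': 2, 'monty_1': 3}
# {'g': 2, 'foo': 'd', 'bar': 'e', 'monty': 'f', 'foo_1': 4, 'bar_1': 5, 'monty_1': 6}
```

Missing group/key combinations come out as None. With pandas, one would more likely reach for DataFrame.pivot with a list of value columns and then flatten the resulting column names; that variant is not shown here.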
Upvotes: 5