Reputation: 4065
I'm looking for a function that works like by but doesn't collapse my DataFrame. In R I would use dplyr's group_by(b) %>% mutate(x1 = sum(a)). I don't want to lose information from the table, such as the values in column :c.
mydf = DataFrame(a = 1:4, b = repeat(1:2,2), c=4:-1:1)
bypreserve(mydf, :b, x -> sum(x.a))
│ Row │ a     │ b     │ c     │ x1    │
│     │ Int64 │ Int64 │ Int64 │ Int64 │
├─────┼───────┼───────┼───────┼───────┤
│ 1   │ 1     │ 1     │ 4     │ 4     │
│ 2   │ 2     │ 2     │ 3     │ 6     │
│ 3   │ 3     │ 1     │ 2     │ 4     │
│ 4   │ 4     │ 2     │ 1     │ 6     │
Upvotes: 3
Views: 735
Reputation: 69949
Adding this functionality is being discussed, but I would say that it will take several months to ship (the general idea is to allow select to take a groupby keyword argument, and also to add a transform function that will work like select but preserve the columns of the source data frame).
For now the solution is to use join after by:
join(mydf, by(mydf, :b, x1 = :a => sum), on=:b)
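For readers on a later release: a transform function along the lines described above did eventually ship. Here is a minimal sketch, assuming DataFrames.jl 0.21 or newer (where by and join gave way to combine with groupby, and innerjoin). The names result and joined are only for illustration.

```julia
using DataFrames  # assumption: DataFrames.jl 0.21 or newer

mydf = DataFrame(a = 1:4, b = repeat(1:2, 2), c = 4:-1:1)

# Group by :b and add the per-group sum of :a as a new column :x1.
# transform keeps all source columns and the row order of mydf.
result = transform(groupby(mydf, :b), :a => sum => :x1)

# The join-based workaround written with the newer function names.
# The row order of a join result should not be relied on.
joined = innerjoin(mydf, combine(groupby(mydf, :b), :a => sum => :x1), on = :b)
```

The transform form avoids the extra join and does not need the grouping column repeated in an on argument, so it is probably the one to prefer when it is available.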
Upvotes: 4