Reputation: 794
I have this dataframe:
Text     feat1  feat2  feat3  feat4
string1  1      1      0      0
string2  0      0      0      1
string3  0      0      0      0
I want to add two more columns, like this:
Text     feat1  feat2  feat3  feat4  all_feat           count_feat
string1  1      1      0      0      ["feat1","feat2"]  2
string2  0      0      0      1      ["feat4"]          1
string3  0      0      0      0      []                 0
What's the best approach to do it in Julia?
Upvotes: 2
Views: 103
Reputation: 69949
Here is one possible way to do it:
julia> df
3×5 DataFrame
 Row │ Text     feat1  feat2  feat3  feat4
     │ String   Int64  Int64  Int64  Int64
─────┼─────────────────────────────────────
   1 │ string1      1      1      0      0
   2 │ string2      0      0      0      1
   3 │ string3      0      0      0      0
julia> transform(df,
                 AsTable(r"feat") =>
                     ByRow(x -> [string(k) for (k,v) in pairs(x) if v == 1]) =>
                     :all_feat,
                 r"feat" => (+) => :count_feat)
3×7 DataFrame
 Row │ Text     feat1  feat2  feat3  feat4  all_feat            count_feat
     │ String   Int64  Int64  Int64  Int64  Array…              Int64
─────┼─────────────────────────────────────────────────────────────────────
   1 │ string1      1      1      0      0  ["feat1", "feat2"]           2
   2 │ string2      0      0      0      1  ["feat4"]                    1
   3 │ string3      0      0      0      0  String[]                     0
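For anyone who wants to run this from scratch, here is a self-contained sketch of the same idea. It rebuilds the example table and names the feature columns explicitly instead of using the regex, since r"feat" would also match all_feat and count_feat once those columns exist. The names feat_cols and result are only illustrative, and using ByRow(sum) for the count is a variation on the (+) reduction above, not a correction of it.

```julia
using DataFrames

# Rebuild the example table from the question.
df = DataFrame(Text  = ["string1", "string2", "string3"],
               feat1 = [1, 0, 0],
               feat2 = [1, 0, 0],
               feat3 = [0, 0, 0],
               feat4 = [0, 1, 0])

# Illustrative name: the feature columns, listed explicitly.
feat_cols = [:feat1, :feat2, :feat3, :feat4]

result = transform(df,
    # AsTable hands each row to the function as a NamedTuple,
    # so pairs(row) yields (column name, value) pairs.
    AsTable(feat_cols) =>
        ByRow(row -> [string(k) for (k, v) in pairs(row) if v == 1]) =>
        :all_feat,
    # Summing the same NamedTuple counts the flags that are set to 1.
    AsTable(feat_cols) => ByRow(sum) => :count_feat)
```

Because the flags are 0/1, the row sum equals the number of names collected in all_feat.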
Upvotes: 4