Reputation: 11
Currently I have a following- two matrices A and B. In matrix A the first row corresponds the investment type and the second row is number of scenario (1,2,3) for both investment types bonds and stocks:
A =["bond" "bond" "bond" "stock" "stock" "stock"; 1 2 3 1 2 3; 0.1 0.12 0.13 0.14 0.15 0.16; 0.17 0.18 0.19 0.2 0.21 0.22]
In matrix B I have two values for both bonds and stocks, the past values.
B= ["bond" "stock"; 0.31 0.32; 0.34 0.35]
My data consists of this similar type of data matrix with multiple investment types and >1000 scenarios. How can I merge these two matrices to create following output for the large dataset:
merged_matrix = ["bond" "bond" "bond" "stock" "stock" "stock";1 2 3 1 2 3; 0.31 0.31 0.31 0.32 0.32 0.32; 0.34 0.34 0.34 0.35 0.35 0.35; 0.1 0.12 0.13 0.14 0.15 0.16; 0.17 0.18 0.19 0.2 0.21 0.22]
I have difficulties in generating the right for-loop iteration. I am unsure how to use the investment type as an identifying key. I have tried to format these to DataFrame, but haven't found any suitable solution either. Any help is highly appreciated!
Upvotes: 1
Views: 67
Reputation: 69829
Per the comment here is how you can do it in DataFrames.jl (I have given some arbitrary names to the columns - probably you can change it to something more meaningful for your case):
julia> using DataFrames
julia> dfA = identity.(DataFrame(permutedims(A), ["type", "v1", "v2", "v3"]))
6×4 DataFrame
Row │ type v1 v2 v3
│ String Int64 Float64 Float64
─────┼─────────────────────────────────
1 │ bond 1 0.1 0.17
2 │ bond 2 0.12 0.18
3 │ bond 3 0.13 0.19
4 │ stock 1 0.14 0.2
5 │ stock 2 0.15 0.21
6 │ stock 3 0.16 0.22
julia> dfB = identity.(DataFrame(permutedims(B), ["type", "v4", "v5"]))
2×3 DataFrame
Row │ type v4 v5
│ String Float64 Float64
─────┼──────────────────────────
1 │ bond 0.31 0.34
2 │ stock 0.32 0.35
julia> rightjoin(dfB, dfA, on="type")
6×6 DataFrame
Row │ type v4 v5 v1 v2 v3
│ String Float64? Float64? Int64 Float64 Float64
─────┼─────────────────────────────────────────────────────
1 │ bond 0.31 0.34 1 0.1 0.17
2 │ bond 0.31 0.34 2 0.12 0.18
3 │ bond 0.31 0.34 3 0.13 0.19
4 │ stock 0.32 0.35 1 0.14 0.2
5 │ stock 0.32 0.35 2 0.15 0.21
6 │ stock 0.32 0.35 3 0.16 0.22
Note that I use identity.(...)
operation to make column types concrete (as your original data has Any
type which is inefficient as was highlighted in the comment).
Upvotes: 2