Reputation: 16107
For example, consider the following data:
> sample.df
f1 f2 x1 x2 x3
1 2 2 7.28 9.40 5.02
2 1 1 6.30 9.56 3.74
3 2 1 6.88 8.72 3.14
4 1 2 6.68 9.58 3.84
I would like to know how to write MAGIC so that:
> sample.matrix <- MAGIC(sample.df)
> sample.matrix[1, 1, ]
[1] 6.30 9.56 3.74
> sample.matrix[1, 2, ]
[1] 6.68 9.58 3.84
Basically, sample.matrix[x, y, ] selects the row of the data frame given by sample.df[sample.df$f1 == x & sample.df$f2 == y, ] and then removes the redundant columns f1 and f2. Note that each combination of (f1, f2) appears exactly once in the data frame.
My first thought was as.matrix followed by a dim<- assignment, but the rows in the data frame are not sorted. Sorting them would take O(n * log(n)) time, whereas I only want to build a lookup table, so in theory the time complexity could be bounded by O(n).
A vectorized solution would be preferable, if possible.
Upvotes: 4
Views: 117
Reputation: 388962
EDIT
After re-reading the question, I think we can use split() without order() to avoid the sorting step. Since each combination of f1 and f2 occurs in exactly one row, we can do
split(sample.df[, -(1:2)], list(sample.df$f1, sample.df$f2))
#$`1.1`
# x1 x2 x3
#2 6.3 9.56 3.74
#$`2.1`
# x1 x2 x3
#3 6.88 8.72 3.14
#$`1.2`
# x1 x2 x3
#4 6.68 9.58 3.84
#$`2.2`
# x1 x2 x3
#1 7.28 9.4 5.02
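If an array that can be indexed as sample.matrix[x, y, ] is needed rather than a list, one possible sketch is to fill a pre-allocated 3-D array through a matrix index, which also needs no sorting. It assumes that f1 and f2 are positive integer codes, as in the sample data; the names vals, arr and idx are only illustrative and not part of the question.

```r
# Sample data from the question
sample.df <- data.frame(
  f1 = c(2, 1, 2, 1),
  f2 = c(2, 1, 1, 2),
  x1 = c(7.28, 6.30, 6.88, 6.68),
  x2 = c(9.40, 9.56, 8.72, 9.58),
  x3 = c(5.02, 3.74, 3.14, 3.84)
)

# Value columns as a plain numeric matrix (n rows, p columns)
vals <- as.matrix(sample.df[, -(1:2)])
n <- nrow(vals)
p <- ncol(vals)

# Assumption: f1 and f2 are positive integer codes, so they can serve as indices
arr <- array(NA_real_, dim = c(max(sample.df$f1), max(sample.df$f2), p))

# One index row (f1, f2, column) per element of vals, in column-major order
idx <- cbind(rep(sample.df$f1, times = p),
             rep(sample.df$f2, times = p),
             rep(seq_len(p), each = n))
arr[idx] <- vals

arr[1, 1, ]
arr[1, 2, ]
```

The last two lines should return the same vectors that the question asks for. If f1 and f2 were not small integer codes, they would first have to be mapped to indices, for example with as.integer(factor(...)).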
Original Answer
I am not exactly clear about the goal, but one way is to order sample.df by f1 and f2 and then subset using Map():
new_df <- sample.df[with(sample.df, order(f1, f2)),]
Map(function(x, y) new_df[with(new_df, f1 == x & f2 == y), -(1:2)],
new_df$f1, new_df$f2)
#[[1]]
# x1 x2 x3
#2 6.3 9.56 3.74
#[[2]]
# x1 x2 x3
#4 6.68 9.58 3.84
#[[3]]
# x1 x2 x3
#3 6.88 8.72 3.14
#[[4]]
# x1 x2 x3
#1 7.28 9.4 5.02
If the above is your expected output, then each row of new_df (without the f1 and f2 columns) is already the output you want. If you want the rows as separate list elements, we can also split new_df row by row:
split(new_df[, -(1:2)], seq_len(nrow(new_df)))
which gives the same data frames as above, with the list elements named 1 to 4.
Upvotes: 1
Reputation: 51582
Here is an idea via matrix(). Note that this is not exactly the output you require, but it can easily be transformed.
Assuming df is your sample.df,
m1 <- matrix(do.call(paste, df[with(df, order(f1, f2)),-c(1, 2)]), nrow = 2, byrow = TRUE)
m1[1, 2]
#[1] "6.68 9.58 3.84"
m1[1, 1]
#[1] "6.3 9.56 3.74"
m1[2, 1]
#[1] "6.88 8.72 3.14"
m1[2, 2]
#[1] "7.28 9.4 5.02"
You can get them as numeric vectors by splitting, i.e.
as.numeric(strsplit(m1[1, 2], ' ')[[1]])
#[1] 6.68 9.58 3.84
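If the values are needed as numbers anyway, the string round-trip can be skipped. A minimal sketch, assuming df holds the sample data with every (f1, f2) combination of the 2 x 2 grid present exactly once: order by f2 and then f1, so that f1 varies fastest and matches R's column-major layout, and reshape the value columns into an array. The names sorted and a1 are only illustrative.

```r
# Sample data from the question
df <- data.frame(
  f1 = c(2, 1, 2, 1),
  f2 = c(2, 1, 1, 2),
  x1 = c(7.28, 6.30, 6.88, 6.68),
  x2 = c(9.40, 9.56, 8.72, 9.58),
  x3 = c(5.02, 3.74, 3.14, 3.84)
)

# Sort so that f1 changes fastest: (1,1), (2,1), (1,2), (2,2)
sorted <- df[with(df, order(f2, f1)), ]

# Reshape the n x 3 value matrix into a 2 x 2 x 3 array (f1, f2, variable)
a1 <- array(as.matrix(sorted[, -(1:2)]), dim = c(2, 2, 3))

a1[1, 2, ]
a1[2, 1, ]
```

This still sorts the rows, so it keeps the O(n * log(n)) step the question wanted to avoid, but the result stays numeric and is indexed the same way as m1.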
Upvotes: 3