user2895129

Row in dataframe to matrix (for proximity ratings)

A number of participants (p1, p2, ...) gave proximity ratings for all pairwise combinations of 4 words (w1.w2, w1.w3, ..., w3.w4), giving the following dataframe:

id  w1.w2  w1.w3  w1.w4  w2.w3  w2.w4  w3.w4  
p1      3      1      6      3      5      2
p2      2      3      5      1      6      1
p3 .....

I would like to convert these ratings into a series of matrices, one per participant, so that I can apply multidimensional scaling to them.
Specifically, I would like to convert my data to the following format:

id  first.wd.in.pair  w2  w3  w4  
p1                w1   3   1   6  
p1                w2       3   5  
p1                w3           2
p2                w1   2   3   5  
p2                w2       1   6  
p2                w3           1  
p3 .....

I've looked into all kinds of reformatting options (e.g. cast in reshape2), but nothing seems to fit my issue.
I've also looked at functions for adjacency matrices (such as get.adjacency() in igraph), but from what I saw they seem to require data in the following format:

id    first.word   second.word   rating
p1            w1            w2        3  
p1            w1            w3        1  
p1            w1            w4        6  
p1  ....

Thanks in advance for any help!

Upvotes: 0

Views: 81

Answers (1)

A5C1D2H2I1M1N2O1R2T1

Reputation: 193547

The easiest approach is melt and dcast from "reshape2".

I don't know what you tried, but this is pretty standard procedure except for one step: splitting the molten "variable" column into its two words. Assuming your input data.frame is called "mydf":

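library(reshape2)

# Long format: one row per participant and word pair
dfL <- melt(mydf, id.vars="id")
# Split names such as "w1.w2" into the two words of the pair
dfL <- cbind(dfL, colsplit(dfL$variable, "\\.", c("first", "other")))
# Back to wide: one row per participant and first word, one column per second word
dcast(dfL, id + first ~ other, value.var="value", fill=0)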
#   id first w2 w3 w4
# 1 p1    w1  3  1  6
# 2 p1    w2  0  3  5
# 3 p1    w3  0  0  2
# 4 p2    w1  2  3  5
# 5 p2    w2  0  1  6
# 6 p2    w3  0  0  1

Here, "mydf" is defined as:

mydf <- structure(list(id = c("p1", "p2"), w1.w2 = c(3L, 2L), w1.w3 = c(1L, 
    3L), w1.w4 = c(6L, 5L), w2.w3 = c(3L, 1L), w2.w4 = 5:6, w3.w4 = c(2L, 
    1L)), .Names = c("id", "w1.w2", "w1.w3", "w1.w4", "w2.w3", "w2.w4", 
    "w3.w4"), class = "data.frame", row.names = c(NA, -2L))

Please share your sample data in such a format in the future.
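
If the end goal is one matrix per participant for multidimensional scaling, a full symmetric matrix may be more convenient than the triangular data.frame above. The following is only a sketch in base R, not part of the reshape2 approach: the helper name "row_to_matrix" and the vector "words" are made up for illustration, and it assumes that every column other than "id" is named "first.second" and that the ratings can be treated as dissimilarities.

```r
# Same sample data as "mydf" above, rebuilt so this block runs on its own
mydf <- data.frame(id = c("p1", "p2"),
                   w1.w2 = c(3L, 2L), w1.w3 = c(1L, 3L), w1.w4 = c(6L, 5L),
                   w2.w3 = c(3L, 1L), w2.w4 = 5:6, w3.w4 = c(2L, 1L),
                   stringsAsFactors = FALSE)

# Assumption: the four words are w1..w4, as in the question
words <- paste0("w", 1:4)

# Hypothetical helper: turn one row of ratings into a symmetric matrix
row_to_matrix <- function(row) {
  m <- matrix(0, length(words), length(words), dimnames = list(words, words))
  for (nm in setdiff(names(row), "id")) {
    pair <- strsplit(nm, ".", fixed = TRUE)[[1]]  # "w1.w2" -> c("w1", "w2")
    m[pair[1], pair[2]] <- row[[nm]]              # upper triangle
    m[pair[2], pair[1]] <- row[[nm]]              # mirror into lower triangle
  }
  m
}

# One matrix per participant, named by participant id
mats <- lapply(seq_len(nrow(mydf)), function(i) row_to_matrix(mydf[i, ]))
names(mats) <- mydf$id

mats$p1
#    w1 w2 w3 w4
# w1  0  3  1  6
# w2  3  0  3  5
# w3  1  3  0  2
# w4  6  5  2  0
```

Each element of "mats" can then be passed through as.dist() and on to an MDS routine such as cmdscale(). If the ratings are similarities rather than dissimilarities, they would need to be converted first.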

Upvotes: 1
