Reputation: 105
I have generated a tibble which is formatted like this:
  V1                n
1 "Sam,Chris"      30
2 "Sam,Peter"      81
3 "Jeff,James"      5
4 "David,Jones"     6
5 "Harry,Otto"      8
I also have a large matrix in which every row and column is labelled with a name, and each name appears once. I need to split each row of V1 so that the matrix cell indexed by the two names receives the count, for example:
[Sam][Chris] = 30
So I'd need to somehow split by the comma and then fill the matrix. How would I go about it?
Upvotes: 1
Views: 134
Reputation: 887118
We can use separate_rows to split V1 on the comma, which gives one row per name:
library(tidyverse)
df1 %>%
  separate_rows(V1, sep = ",")
If we want the output in matrix form, split V1 into two columns and spread the second one into wide format:
df1 %>%
  separate(V1, into = c("V1", "V2"), sep = ",") %>%
  spread(V2, n, fill = 0) %>%
  column_to_rownames("V1")
#      Chris James Jones Otto Peter
#David     0     0     6    0     0
#Harry     0     0     0    8     0
#Jeff      0     5     0    0     0
#Sam      30     0     0    0    81
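Since spread is superseded in current tidyr, here is a minimal sketch of the same reshaping with pivot_wider, ending in as.matrix() so the result is an actual matrix rather than a data frame. It assumes a reasonably recent tidyverse; the name mat is only illustrative, and the data is the example df1 from this answer.

```r
library(tidyverse)

# Same example data as in the answer
df1 <- data.frame(
  V1 = c("Sam,Chris", "Sam,Peter", "Jeff,James", "David,Jones", "Harry,Otto"),
  n = c(30L, 81L, 5L, 6L, 8L),
  stringsAsFactors = FALSE
)

mat <- df1 %>%
  # split "Sam,Chris" into a row name (V1) and a column name (V2)
  separate(V1, into = c("V1", "V2"), sep = ",") %>%
  # one column per second name; pairs that never occur are filled with 0
  pivot_wider(names_from = V2, values_from = n, values_fill = 0) %>%
  # move the first names into the row names, then drop to a plain matrix
  column_to_rownames("V1") %>%
  as.matrix()

mat["Sam", "Peter"]
```

Unlike spread, pivot_wider should keep rows and columns in order of first appearance rather than sorting them, so index by name rather than by position.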
It can be converted to a square matrix by including every name, from both halves of V1, in both the row and the column names:
tmp <- df1 %>%
  separate(V1, into = c("V1", "V2"), sep = ",")
lvls <- sort(unique(unlist(tmp[1:2])))
tmp %>%
  mutate_at(vars(V1, V2), factor, levels = lvls) %>%
  spread(V2, n, fill = 0, drop = FALSE)
data
df1 <- structure(list(V1 = c("Sam,Chris", "Sam,Peter", "Jeff,James",
    "David,Jones", "Harry,Otto"), n = c(30L, 81L, 5L, 6L, 8L)),
    class = "data.frame", row.names = c("1", "2", "3", "4", "5"))
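If the named matrix already exists, as the question describes, it may be simpler to fill it in place. The following is a base R sketch, not part of the original answer: m is a hypothetical stand-in for the asker's matrix (all zeros, with every name on both axes), and pairs is an illustrative helper name. It relies on the fact that a two-column character matrix used as an index is matched against the row and column names.

```r
# Example data from the answer
df1 <- data.frame(
  V1 = c("Sam,Chris", "Sam,Peter", "Jeff,James", "David,Jones", "Harry,Otto"),
  n = c(30L, 81L, 5L, 6L, 8L),
  stringsAsFactors = FALSE
)

# Split on the comma: one row per pair, columns = (row name, column name)
pairs <- do.call(rbind, strsplit(df1$V1, ",", fixed = TRUE))

# Hypothetical stand-in for the existing matrix (assumption: zeros,
# with every name appearing once on each axis)
nms <- sort(unique(as.vector(pairs)))
m <- matrix(0, nrow = length(nms), ncol = length(nms),
            dimnames = list(nms, nms))

# A two-column character matrix indexes cells by dimnames,
# so this sets m["Sam", "Chris"] <- 30, m["Sam", "Peter"] <- 81, ...
m[pairs] <- df1$n

m["Sam", "Chris"]
```

This fills only the [first name, second name] cell of each pair. If the matrix is meant to be symmetric, the same assignment can be repeated with the columns swapped, m[pairs[, 2:1]] <- df1$n.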
Upvotes: 1