stats_noob

Reputation: 5925

R: Extracting Non-Zero Elements From a Matrix

I am working with the R programming language.

I have the following matrix in R:

set.seed(123)
mat <- matrix(sample(0:1, 100, replace = TRUE), nrow = 10, ncol = 10, 
              dimnames = list(c('aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg', 'hhh', 'iii', 'jjj'), 
                              c('111', '222', '333', '444', '555', '666', '777', '888', '999', '101010')))
mat

The matrix looks like this:

> mat
    111 222 333 444 555 666 777 888 999 101010
aaa   0   1   0   0   0   1   0   1   1      1
bbb   0   1   1   1   1   0   1   0   1      1
ccc   0   1   0   0   1   0   1   0   0      0
ddd   1   0   0   1   0   0   0   0   1      1
eee   0   1   0   1   0   0   0   0   1      1
fff   1   0   0   0   0   1   1   1   1      0
ggg   1   1   1   0   0   1   0   1   0      0
hhh   1   0   1   0   1   0   0   0   0      1
iii   0   0   0   0   0   1   0   1   1      0
jjj   0   0   1   1   0   0   0   1   0      0

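My Question: Based on this matrix, I want to create a data frame with 2 columns and 10 rows (one per matrix row). The first column should hold the row name, and the second should list the names of the columns whose value in that row is non-zero.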

For example, the first two rows of the final result should look like this:

  col1                                 col2
1  aaa           222, 666, 888, 999, 101010
2  bbb 222, 333, 444, 555, 777, 999, 101010

structure(list(col1 = c("aaa", "bbb"), col2 = c("222, 666, 888, 999, 101010", 
"222, 333, 444, 555, 777, 999, 101010")), class = "data.frame", row.names = c(NA, 
-2L))

After a lot of trial and error, I came up with the following code:

df <- data.frame(col1 = rownames(mat), 
                 col2 = apply(mat, 1, function(x) paste0(colnames(mat)[which(x != 0)], collapse = ", ")))

    col1                                 col2
aaa  aaa           222, 666, 888, 999, 101010
bbb  bbb 222, 333, 444, 555, 777, 999, 101010
ccc  ccc                        222, 555, 777
ddd  ddd                111, 444, 999, 101010
eee  eee                222, 444, 999, 101010
fff  fff              111, 666, 777, 888, 999
ggg  ggg              111, 222, 333, 666, 888
hhh  hhh                111, 333, 555, 101010
iii  iii                        666, 888, 999
jjj  jjj                        333, 444, 888
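
A quick way to sanity-check the expression is to run it on a tiny matrix whose answer is obvious by eye. The sketch below assumes a hypothetical 3 x 3 matrix called small (not part of my real data); its last row is all zeros, to see what happens when nothing is selected:

```r
# Hypothetical toy matrix (not the data above); matrix() fills column by column
small <- matrix(c(0, 1, 0,   # column "a"
                  1, 0, 0,   # column "b"
                  1, 0, 0),  # column "c"
                nrow = 3,
                dimnames = list(c("r1", "r2", "r3"), c("a", "b", "c")))

# Same expression as above, applied to the toy matrix
check <- data.frame(col1 = rownames(small),
                    col2 = apply(small, 1, function(x) paste0(colnames(small)[which(x != 0)], collapse = ", ")),
                    stringsAsFactors = FALSE)
check
```

Reading the toy matrix by eye, r1 should give "b, c", r2 should give "a", and the all-zero row r3 should give an empty string.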

Can someone please tell me if I have done this correctly?

Thanks!

Upvotes: 0

Views: 41

Answers (1)

Kra.P

Reputation: 15143

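You may try the following. Note that the output shown below does not match the matrix printed in the question, presumably because mat came from a different random draw when it was run: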

# For each row, collapse the names of the columns equal to 1, then stack into a data frame
stack(apply(mat, 1, function(x) paste(names(which(x == 1)), collapse = ", ")))

                              values ind
1              111, 888, 999, 101010 aaa
2    444, 555, 666, 777, 999, 101010 bbb
3    111, 222, 444, 555, 888, 101010 ccc
4            222, 333, 444, 555, 888 ddd
5              666, 777, 888, 101010 eee
6            111, 333, 555, 777, 999 fff
7    111, 222, 333, 444, 888, 101010 ggg
8                   555, 999, 101010 hhh
9       111, 222, 555, 777, 888, 999 iii
10 111, 222, 333, 444, 555, 666, 888 jjj
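
stack() names its columns values and ind, and ind is a factor. If the col1/col2 layout from the question is needed, the result can be reordered and renamed afterwards. This is only a sketch under a few assumptions: the column names col1 and col2 are taken from the question, the intermediate name res is my own, and I select with x != 0 rather than x == 1 so that any non-zero entry counts:

```r
# Matrix from the question
set.seed(123)
mat <- matrix(sample(0:1, 100, replace = TRUE), nrow = 10, ncol = 10,
              dimnames = list(c('aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff', 'ggg', 'hhh', 'iii', 'jjj'),
                              c('111', '222', '333', '444', '555', '666', '777', '888', '999', '101010')))

# One comma-separated string of column names per row, kept as character
res <- stack(apply(mat, 1, function(x) paste(names(x)[x != 0], collapse = ", ")))

# Put the row names first and rename to the layout asked for in the question
res <- setNames(res[, c("ind", "values")], c("col1", "col2"))
res$col1 <- as.character(res$col1)  # stack() returns the names as a factor
res
```

The exact strings depend on the random draw used to build mat, so no output is shown here.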

Upvotes: 1
