R Dataframe make a new column combining row and column names

Question

I have a similarity matrix (SimMat) which I've converted to a dataframe by writing it out as .csv and re-reading it into the environment.

    A    B    C
A   1  0.3  0.7
B 0.3    1  0.5
C 0.7  0.5    1

I want to change this so that I have a dataframe detailing the unique comparisons and values like below:

Comp  Val
A-A   1
A-B   0.3
A-C   0.7
B-B   1
B-C   0.5
C-C   1

Does anyone know how I might be able to do this?

Ronak Shah · Accepted Answer

Using the tidyverse, we can move the row names into a new column, reshape the data to long format, and then combine the row names and column names into a single column.

library(tidyverse)

df %>%
  #If it is a matrix convert to dataframe
  #as.data.frame() %>%
  rownames_to_column() %>%
  pivot_longer(cols = -rowname) %>%
  unite(name, rowname, name, sep = "-")

#  name  value
#  <chr> <dbl>
#1 A-A     1  
#2 A-B     0.3
#3 A-C     0.7
#4 B-A     0.3
#5 B-B     1  
#6 B-C     0.5
#7 C-A     0.7
#8 C-B     0.5
#9 C-C     1

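To keep only the unique comparisons (so that A-B and B-A count as the same pair), we can use pmin and pmax to put the two names of each pair in a consistent order before calling distinct.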

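df %>%
  # If df is a matrix, convert it to a data frame first
  #as.data.frame() %>%
  rownames_to_column() %>%
  pivot_longer(cols = -rowname) %>%
  # Order each pair alphabetically so that B-A becomes A-B
  mutate(newcol1 = pmin(rowname, name), newcol2 = pmax(rowname, name)) %>%
  select(-rowname, -name) %>%
  # Drop the rows that are now duplicates
  distinct() %>%
  unite(Comp, newcol1, newcol2, sep = "-")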
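
To see why pmin and pmax remove the mirrored pairs, here is a small base R sketch. The vectors row_lab and col_lab are made-up names for illustration; they stand in for the rowname and name columns of the two cells A-B and B-A.

```r
# Labels of two mirrored cells: (row A, column B) and (row B, column A)
row_lab <- c("A", "B")
col_lab <- c("B", "A")

# pmin()/pmax() work element-wise; on character vectors they return the
# alphabetically smaller/larger label of each pair
first  <- pmin(row_lab, col_lab)
second <- pmax(row_lab, col_lab)

# Both cells now carry the same label
pair <- paste(first, second, sep = "-")
print(pair)
```

Both cells end up labelled A-B, so distinct() can collapse them into one row. Note that distinct() in the pipeline also compares the value column, so this relies on the matrix being exactly symmetric.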

data

df <- structure(list(A = c(1, 0.3, 0.7), B = c(0.3, 1, 0.5), C = c(0.7, 
0.5, 1)), class = "data.frame", row.names = c("A", "B", "C"))
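
As an alternative that avoids the tidyverse, the same result can be sketched in base R with upper.tri. This is not the approach used above, and it assumes the matrix is symmetric, so that the upper triangle (including the diagonal) holds every unique comparison. The names m, keep and res are chosen here for illustration.

```r
# Same example data as above, rebuilt so the block runs on its own
df <- data.frame(A = c(1, 0.3, 0.7), B = c(0.3, 1, 0.5), C = c(0.7, 0.5, 1),
                 row.names = c("A", "B", "C"))

m <- as.matrix(df)

# Logical mask selecting the upper triangle, diagonal included
keep <- upper.tri(m, diag = TRUE)

# row(m) and col(m) give the row/column index of every cell;
# indexing them with the mask yields the labels of the kept cells
res <- data.frame(
  Comp = paste(rownames(m)[row(m)[keep]], colnames(m)[col(m)[keep]], sep = "-"),
  Val  = m[keep],
  stringsAsFactors = FALSE
)

# The mask is read column by column, so sort to get A-A, A-B, A-C, ...
res <- res[order(res$Comp), ]
rownames(res) <- NULL
print(res)
```

This returns the columns as Comp and Val, in the order asked for in the question.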
