sgatta

Reputation: 35

Replace a value in a dataframe with the row name

I have a data frame that looks like this:

         Jill Jimmie Alex Jane 
    Jill   1    0      1    1    
    Jimmie 0    1      1    0    
    Alex   0    1      1    0    
    Jane   1    1      1    0    

I want to change every 1 to its corresponding row name, such as this:

           Jill Jimmie   Alex Jane
    Jill   Jill      0   Jill Jill
    Jimmie    0 Jimmie Jimmie    0
    Alex      0   Alex   Alex    0
    Jane   Jane   Jane   Jane    0

After that, I want to remove all zeros from the data frame and move up the values in the columns.

I've tried:

    # for (i in ibm_data){
    #   if (ibm_data == 1){
    #     names <- row.names(i)
    #     ibm_data[ibm_data == 1] <- names
    #   }
    #   else{
    #     ibm_data[ibm_data == 0] <- "NA"
    #   }
    # }

Then I would remove the NA values, but I think I am making this overly complicated. I plan to build a force-directed graph from the list of names to see correlations.

Upvotes: 3

Views: 3218

Answers (3)

M--

Reputation: 28825

While I like @akrun's one-liner, I want to post a more explicit answer:

k <- which(df==1, arr.ind=TRUE)
df[k] <- rownames(k)

df

#        Jill Jimmie   Alex Jane 
# Jill   Jill      0   Jill Jill 
# Jimmie    0 Jimmie Jimmie    0 
# Alex      0   Alex   Alex    0 
# Jane   Jane   Jane   Jane    0
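
To see why the matrix index works, here is a small sketch that inspects k before the assignment. It rebuilds the question's data inline so that it runs on its own; the printed rows in the comments assume exactly that data.

```r
# Rebuild the question's data (same values as in the Data section of this answer).
df <- read.table(text = "Jill Jimmie Alex Jane
Jill   1 0 1 1
Jimmie 0 1 1 0
Alex   0 1 1 0
Jane   1 1 1 0", header = TRUE)

# One row per cell equal to 1; the row names of k are the row names of df.
k <- which(df == 1, arr.ind = TRUE)
head(k, 3)
#        row col
# Jill     1   1
# Jane     4   1
# Jimmie   2   2

# A two-column (row, col) matrix index addresses individual cells,
# so each matching cell receives its own row name.
df[k] <- rownames(k)
```

Because character values are assigned, the touched columns become character and the remaining zeros turn into the string "0".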

Data

df <- read.table(text='Jill Jimmie Alex Jane 
       Jill   1    0      1    1    
       Jimmie 0    1      1    0    
       Alex   0    1      1    0    
       Jane   1    1      1    0 ', header = TRUE, quote ='"')

Upvotes: 1

lmo

Reputation: 38500

I think this accomplishes both steps: filling in the row names for cells with the value 1, and removing "all zeros from the data frame and move up the values in the columns."

It returns a named list with one element per column; each element holds the names of the rows where that column equals 1.

lapply(dat, function(x) rownames(dat)[x==1])
$Jill
[1] "Jill" "Jane"

$Jimmie
[1] "Jimmie" "Alex"   "Jane"  

$Alex
[1] "Jill"   "Jimmie" "Alex"   "Jane"  

$Jane
[1] "Jill"
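
If a rectangular result is needed rather than a list, one possible sketch is to pad each element with NA up to the longest length. The data is rebuilt here from the question so the example runs on its own; the names cols, n and padded are made up for illustration.

```r
# Rebuild the question's data under the name used in this answer.
dat <- read.table(text = "Jill Jimmie Alex Jane
Jill   1 0 1 1
Jimmie 0 1 1 0
Alex   0 1 1 0
Jane   1 1 1 0", header = TRUE)

# Row names per column where the cell equals 1 (the list shown above).
cols <- lapply(dat, function(x) rownames(dat)[x == 1])

# Extend every vector to the longest length; `length<-` fills with NA.
n <- max(lengths(cols))
padded <- as.data.frame(lapply(cols, `length<-`, n), stringsAsFactors = FALSE)
padded
#   Jill Jimmie   Alex Jane
# 1 Jill Jimmie   Jill Jill
# 2 Jane   Alex Jimmie <NA>
# 3 <NA>   Jane   Alex <NA>
# 4 <NA>   <NA>   Jane <NA>
```

Keeping the list form is probably more convenient for building a graph, since each element already lists the names linked to that column.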

Upvotes: 2

akrun

Reputation: 886948

Here is one option using replace and row:

df1[] <- replace(row.names(df1)[row(df1)*(NA^!df1)], !df1, 0)
df1
#       Jill Jimmie   Alex Jane
#Jill   Jill      0   Jill Jill
#Jimmie    0 Jimmie Jimmie    0
#Alex      0   Alex   Alex    0
#Jane   Jane   Jane   Jane    0

The idea is to get the row index of every cell with the row function, set that index to NA wherever the original data.frame holds 0, use the index to look up the corresponding row names, and then replace the resulting NAs with 0 by using the logical matrix (!df1 returns TRUE where the value is 0 and FALSE where it is 1).
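
The same steps can be spelled out one at a time. This is only an unrolled sketch of the one-liner; the intermediate names is_zero, idx and nms are made up for illustration, and the data is rebuilt inline with data.frame so the block runs on its own.

```r
# Same values as the data section of this answer.
df1 <- data.frame(Jill   = c(1L, 0L, 0L, 1L),
                  Jimmie = c(0L, 1L, 1L, 1L),
                  Alex   = c(1L, 1L, 1L, 1L),
                  Jane   = c(1L, 0L, 0L, 0L),
                  row.names = c("Jill", "Jimmie", "Alex", "Jane"))

# Logical matrix: TRUE where the cell is 0, FALSE where it is 1.
is_zero <- !df1

# NA^TRUE is NA and NA^FALSE is 1, so the row index survives only for the 1s.
idx <- row(df1) * (NA^is_zero)

# Look up row names by index; NA indices give NA names.
nms <- row.names(df1)[idx]

# Put 0 back where the original cell was 0 and refill the frame column by column.
df1[] <- replace(nms, is_zero, 0)
```

After the last line every column is character, so the zeros are stored as the string "0".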


Or, more straightforwardly:

df1[] <- replace(row.names(df1)[row(df1)], !df1, 0)

data

df1 <- structure(list(Jill = c(1L, 0L, 0L, 1L), Jimmie = c(0L, 1L, 1L, 
1L), Alex = c(1L, 1L, 1L, 1L), Jane = c(1L, 0L, 0L, 0L)), .Names = c("Jill", 
"Jimmie", "Alex", "Jane"), class = "data.frame", row.names = c("Jill", 
"Jimmie", "Alex", "Jane"))

Upvotes: 1
