fifigoblin

Reputation: 455

Remove some duplicated columns & rename others based on conditions

I've got a dataframe that looks like this:

     Sample_1  Sample_2  Sample_1  Sample_2
1979 0.22      0.50      0.22      0.67
1980 0.15      0.30      0.15      0.77

I want to remove the duplicated Sample_1 column, because the two copies are identical (they have the same values for the same years). However, I want to keep both Sample_2 columns: even though the name is duplicated, the values aren't, so I want to rename such columns to something else in order to keep them (for example Sample_2_edit or Sample_2_).

How can I do this?

Upvotes: 1

Views: 143

Answers (1)

akrun

Reputation: 887213

This assumes the data is a matrix: data.frame() makes duplicated column names unique by default (check.names = TRUE), and data.table does not support row names. We can apply duplicated() both to the column names and to the column values, with the columns split into a list by asplit(m1, 2). Drop the columns that are flagged as duplicates in both checks, then rename the remaining duplicated names by making them unique with make.unique().

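# drop columns flagged as duplicated both by value and by name
m2 <- m1[, !(duplicated(asplit(m1, 2)) & duplicated(colnames(m1))), drop = FALSE]
# append .1, .2, ... to any column names that are still repeated
colnames(m2) <- make.unique(colnames(m2))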

-output

m2
#     Sample_1 Sample_2 Sample_2.1
#1979     0.22      0.5       0.67
#1980     0.15      0.3       0.77
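
To see why only the second Sample_1 is dropped, it can help to look at the two logical vectors separately. The sketch below rebuilds the same example matrix so that it runs on its own. The names dup_values, dup_names and drop_col are only illustrative, and sep = "_" is an optional tweak to make.unique() that gives a suffix closer to the Sample_2_ style mentioned in the question.

```r
# Rebuild the example matrix (same values as in the question)
m1 <- matrix(c(0.22, 0.15, 0.5, 0.3, 0.22, 0.15, 0.67, 0.77), nrow = 2,
             dimnames = list(c("1979", "1980"),
                             c("Sample_1", "Sample_2", "Sample_1", "Sample_2")))

# TRUE where a column's values repeat those of an earlier column
dup_values <- as.vector(duplicated(asplit(m1, 2)))
# TRUE where a column's name repeats an earlier name
dup_names <- duplicated(colnames(m1))

# Drop only the columns flagged by both checks
drop_col <- dup_values & dup_names
m2 <- m1[, !drop_col, drop = FALSE]

# Optional: use "_" instead of the default "." as the suffix separator
colnames(m2) <- make.unique(colnames(m2), sep = "_")
colnames(m2)
```

Here dup_names flags both the third and fourth columns, but dup_values flags only the third, so the second Sample_2 survives and is renamed.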

data

m1 <- structure(c(0.22, 0.15, 0.5, 0.3, 0.22, 0.15, 0.67, 0.77), .Dim = c(2L, 
4L), .Dimnames = list(c("1979", "1980"), c("Sample_1", "Sample_2", 
"Sample_1", "Sample_2")))
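
One possible caveat, offered as an observation rather than as part of the answer: the two duplicated() checks are evaluated independently, so a column whose values repeat one earlier column while its name repeats a different earlier column would also be dropped. If that could happen in your data, a stricter check might compare each column only with earlier columns of the same name. The helper drop_exact_dups and the matrix m3 below are hypothetical, used only for illustration, and the sketch assumes the matrix has column names.

```r
# Hypothetical helper: drop a column only if an earlier column has BOTH
# the same name and identical values, then make the remaining names unique.
# Assumes m is a matrix with column names.
drop_exact_dups <- function(m) {
  nms <- colnames(m)
  is_dup <- logical(ncol(m))
  for (j in seq_len(ncol(m))) {
    # indices of earlier columns that share this column's name
    earlier <- which(nms[seq_len(j - 1)] == nms[j])
    is_dup[j] <- any(vapply(earlier,
                            function(i) identical(m[, i], m[, j]),
                            logical(1)))
  }
  out <- m[, !is_dup, drop = FALSE]
  colnames(out) <- make.unique(colnames(out))
  out
}

# Illustrative matrix: the third column shares its values with "B"
# but its name with "A", so it is not an exact duplicate of either.
m3 <- matrix(c(1, 2, 3, 4, 3, 4), nrow = 2,
             dimnames = list(c("1979", "1980"), c("A", "B", "A")))

res <- drop_exact_dups(m3)
colnames(res)
```

With m3 all three columns are kept and the second "A" is renamed. On the m1 example from the question, this stricter check should give the same columns as the one-liner in the answer.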

Upvotes: 1
