Reputation: 455
I've got a dataframe that looks like this:
     Sample_1 Sample_2 Sample_1 Sample_2
1979     0.22     0.50     0.22     0.67
1980     0.15     0.30     0.15     0.77
I want to remove the duplicated Sample_1 column, because the two copies are identical (they have the same values for the same years). However, I want to keep both Sample_2 columns because, even though the name is duplicated, the values aren't. So I want to rename duplicated columns of this kind in order to keep them (for example Sample_2_edit or Sample_2_).
How can I do this?
Reputation: 887213
Assuming the data is a matrix, because data.frame makes duplicate column names unique by default (check.names = TRUE) and data.table doesn't support row names at all. We can apply duplicated to both the column names and the column values, with the columns split into a list by asplit. Drop the columns that are duplicated in both respects, then rename the remaining duplicate names by making them unique with make.unique:
m2 <- m1[, !(duplicated(asplit(m1, 2)) & duplicated(colnames(m1))), drop = FALSE]
colnames(m2) <- make.unique(colnames(m2))
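To see why only the second Sample_1 is dropped, it may help to look at the two logical vectors separately. This is a minimal sketch that rebuilds the same m1 with matrix(); the names dup_values, dup_names and drop_col are introduced here for illustration only.

```r
# Same values as m1 in the answer, rebuilt here so the snippet runs on its own
m1 <- matrix(c(0.22, 0.15, 0.5, 0.3, 0.22, 0.15, 0.67, 0.77), nrow = 2,
             dimnames = list(c("1979", "1980"),
                             c("Sample_1", "Sample_2", "Sample_1", "Sample_2")))

# TRUE where a column's values repeat those of an earlier column
dup_values <- duplicated(asplit(m1, 2))   # FALSE FALSE TRUE FALSE

# TRUE where a column's name repeats an earlier name
dup_names <- duplicated(colnames(m1))     # FALSE FALSE TRUE TRUE

# A column is dropped only when both conditions hold
drop_col <- dup_values & dup_names        # FALSE FALSE TRUE FALSE
```

Note that the two checks are independent: a column is flagged if its values match any earlier column and its name matches any earlier column, not necessarily the same one. That is enough for the example data, but may be worth checking on wider inputs.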
-output
m2
# Sample_1 Sample_2 Sample_2.1
#1979 0.22 0.5 0.67
#1980 0.15 0.3 0.77
-data
m1 <- structure(c(0.22, 0.15, 0.5, 0.3, 0.22, 0.15, 0.67, 0.77),
  .Dim = c(2L, 4L),
  .Dimnames = list(c("1979", "1980"),
                   c("Sample_1", "Sample_2", "Sample_1", "Sample_2")))
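If the data really is a data.frame that carries duplicated names (for example after assigning names by hand or reading with check.names = FALSE), the same idea might be adapted as below. This is a sketch under that assumption, not part of the original answer; df1, df2 and drop_col are illustrative names.

```r
# Hypothetical data frame with the same values; the duplicated names are
# assigned afterwards, because data.frame() would otherwise make them unique
df1 <- data.frame(V1 = c(0.22, 0.15), V2 = c(0.50, 0.30),
                  V3 = c(0.22, 0.15), V4 = c(0.67, 0.77),
                  row.names = c("1979", "1980"))
names(df1) <- c("Sample_1", "Sample_2", "Sample_1", "Sample_2")

# as.list() turns the columns into list elements, playing the role of asplit()
drop_col <- duplicated(as.list(df1)) & duplicated(names(df1))

# Keep the remaining columns, then make any remaining duplicate names unique
df2 <- df1[!drop_col]
names(df2) <- make.unique(names(df2))
```

After this, df2 should hold three columns named Sample_1, Sample_2 and Sample_2.1, matching the matrix result.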