Reputation: 421
I'm working on a program and I'm looking for a way to check the column names when a file is uploaded. If the names are not unique, an error should be raised. Is there any way to do this?
For example, if I have this data frame:
> a <- c(10, 20, 30)
> b <- c(1, 2, 3)
> c <- c("Peter", "Ann", "Mike")
> test <- data.frame(a, b, c)
with:
library(dplyr)
test <- rename(test, Number = a)
test <- rename(test, Number = b)
> test
Number Number c
1 10 1 Peter
2 20 2 Ann
3 30 3 Mike
If this were a file, how could I check whether the column names are unique? Ideally the result would be just TRUE or FALSE.
Thanks!
Upvotes: 4
Views: 3310
Reputation: 13309
We can use:
any(duplicated(names(df))) #tested with df as iris
[1] FALSE
On OP's data:
any(duplicated(names(test)))
[1] TRUE
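The above can be simplified with anyDuplicated(), as suggested by @sindri_baldur and @akrun. Note that it returns the index of the first duplicated name (0 if there is none) rather than TRUE/FALSE: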
anyDuplicated(names(test))
If you wish to know how many are duplicated:
length(which(duplicated(names(test))==TRUE))
[1] 1
This can also be simplified to (as suggested by @sindri_baldur):
sum(duplicated(names(test)))
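Since the question is about an uploaded file, here is a minimal sketch of how the same check could be applied at read time. It assumes the upload is a CSV file; the helper name check_unique_names and the temporary file are hypothetical, made up for illustration. One detail worth knowing: read.csv() defaults to check.names = TRUE, which silently renames duplicates (e.g. to Number.1), so the check would never fire unless that is switched off.

```r
# Hypothetical helper (not from the original post): read only the header
# plus one data row and stop with an error if any column name is repeated.
check_unique_names <- function(path, sep = ",") {
  # check.names = FALSE keeps the header exactly as written in the file;
  # the default (TRUE) would rename duplicates and hide the problem.
  header <- read.csv(path, sep = sep, nrows = 1, check.names = FALSE)
  dupes <- unique(names(header)[duplicated(names(header))])
  if (length(dupes) > 0) {
    stop("Duplicated column names: ", paste(dupes, collapse = ", "))
  }
  invisible(TRUE)
}

# Usage: a temporary file stands in for the uploaded file (assumption).
tmp <- tempfile(fileext = ".csv")
writeLines(c("Number,Number,c", "10,1,Peter", "20,2,Ann"), tmp)
result <- tryCatch(check_unique_names(tmp),
                   error = function(e) conditionMessage(e))
```

If only TRUE/FALSE is wanted instead of an error, the stop() call could be replaced by returning length(dupes) == 0.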
Upvotes: 5
Reputation: 1503
## Build a data.frame with a duplicated column name
test.frame <- data.frame(a = c(1:5), b = c(6:10))
a <- c(5:1)
test.frame <- cbind(test.frame, a)
## Function that reports the number of duplicated column names
test.unique <- function(df) {
  length1 <- length(colnames(df))
  length2 <- length(unique(colnames(df)))
  if (length1 - length2 > 0) {
    print(paste("There are", length1 - length2, "duplicates"))
  }
}
This results in ...
test.unique(test.frame)
[1] "There are 1 duplicates"
Upvotes: 2
Reputation: 346
Check out the functions unique() and colnames(). For example:
are.unique.colnames <- function(array){
return(length(unique(colnames(array))) == dim(array)[2])
}
This function compares the number of distinct column names with the number of columns (a simple and useful piece of metadata for any array-like structure) and returns TRUE only when they match.
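As a quick illustration, this is how the function might be applied to data like the OP's. The data frame is rebuilt here with check.names = FALSE so the duplicated name survives (an assumption about how the data is constructed; the OP used rename()), and ncol() is used in place of dim(array)[2], which is equivalent for data frames and matrices.

```r
# Same idea as above, written with ncol()
are.unique.colnames <- function(array) {
  length(unique(colnames(array))) == ncol(array)
}

# Rebuild the OP's example; check.names = FALSE keeps both "Number" columns
test <- data.frame(Number = c(10, 20, 30),
                   Number = c(1, 2, 3),
                   c = c("Peter", "Ann", "Mike"),
                   check.names = FALSE)

dup_result  <- are.unique.colnames(test)  # FALSE: "Number" appears twice
iris_result <- are.unique.colnames(iris)  # TRUE: all names are distinct
```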
Upvotes: 0