Timothy_Goodman

Reputation: 421

Function to check if column names are unique

I'm working on a program and am looking for a way to check the column names when a file is uploaded. If the names are not unique, an error should be reported. Is there a way to do this?

For example, if I have this data frame:

> a <- c(10, 20, 30)
> b <- c(1, 2, 3)
> c <- c("Peter", "Ann", "Mike")
> test <- data.frame(a, b, c)

with:

library(dplyr)
test <- rename(test, Number = a)
test <- rename(test, Number = b)
> test
  Number Number     c
1     10      1 Peter
2     20      2   Ann
3     30      3  Mike

If this were a file, how could I check whether the column names are unique? Ideally the result would be just TRUE or FALSE.

Thanks!

Upvotes: 4

Views: 3310

Answers (3)

NelsonGon

Reputation: 13309

We can use:

any(duplicated(names(df))) #tested with df as iris
[1] FALSE

On OP's data:

any(duplicated(names(test)))
[1] TRUE

The above can be simplified with anyDuplicated(), as suggested by @sindri_baldur and @akrun. It returns the index of the first duplicated name, or 0 if there are none:

anyDuplicated(names(test))

If you wish to know how many are duplicated:

length(which(duplicated(names(test))==TRUE))
[1] 1

This can also be simplified to (as suggested by @sindri_baldur):

sum(duplicated(names(test)))
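For the file-upload case in the question, the check can be wrapped in a small function that returns only TRUE or FALSE. The following is a minimal sketch, not a definitive implementation: the helper name `has_unique_colnames` and the temporary CSV file are assumptions made for illustration. One thing to watch is that `read.csv()` defaults to `check.names = TRUE`, which silently renames repeated headers (for example to `Number.1`), so the file has to be read with `check.names = FALSE` for the duplicates to be visible at all.

```r
# Hypothetical helper (the name is an assumption, not part of the answer above):
# returns TRUE when all column names are distinct, FALSE otherwise.
has_unique_colnames <- function(df) {
  anyDuplicated(names(df)) == 0L
}

# Simulate an uploaded file whose header repeats a column name.
path <- tempfile(fileext = ".csv")
writeLines(c("Number,Number,c", "10,1,Peter", "20,2,Ann", "30,3,Mike"), path)

# check.names = FALSE keeps the header exactly as written in the file.
uploaded <- read.csv(path, check.names = FALSE)

has_unique_colnames(uploaded)  # FALSE
has_unique_colnames(iris)      # TRUE

# Report the problem without aborting the script; use stop() instead
# of message() if the upload should fail hard.
if (!has_unique_colnames(uploaded)) {
  message("Column names are not unique")
}
```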

Upvotes: 5

DarrenRhodes

Reputation: 1503

## Build data.frame with duplicate column

test.frame <- data.frame(a = c(1:5), b = c(6:10))
a <- c(5:1)
test.frame  <- cbind(test.frame, a)

test.unique <- function(df) {  ## function to test unique columns

  length1 <- length(colnames(df))          # total number of column names
  length2 <- length(unique(colnames(df)))  # number of distinct column names
  if (length1 - length2 > 0) {
    # paste() already separates its arguments with a single space
    print(paste("There are", length1 - length2, "duplicates"))
  }
}

This results in ...

test.unique(test.frame)

[1] "There are 1 duplicates"
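If it is also useful to know which names are repeated, the same idea can return them instead of printing a count. This is only a sketch under that assumption; `duplicated_colnames` is a hypothetical name introduced here, not part of the function above.

```r
# Hypothetical helper: returns the column names that occur more than once
# (an empty character vector means all names are unique).
duplicated_colnames <- function(df) {
  nms <- colnames(df)
  unique(nms[duplicated(nms)])
}

# Same construction as above: cbind() keeps the repeated name "a".
test.frame <- data.frame(a = 1:5, b = 6:10)
test.frame <- cbind(test.frame, a = 5:1)

dups <- duplicated_colnames(test.frame)
dups                # "a"
length(dups) == 0   # FALSE, so the column names are not unique
```

An empty result doubles as the TRUE/FALSE answer the question asks for, via `length(dups) == 0`.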

Upvotes: 2

Elie Ker Arno

Reputation: 346

Take a look at the functions unique() and colnames(). For example:

are.unique.colnames <- function(array){
  return(length(unique(colnames(array))) == dim(array)[2])
}

This function compares the number of distinct column names with the number of columns (an easy and useful piece of metadata for any array-like structure).
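A short usage sketch, repeating the definition so that it runs on its own. The matrix `m` is made up for illustration. Because the function relies on colnames() and dim(), it works for matrices as well as data frames, but note that a matrix with no column names at all returns FALSE, since colnames() is then NULL.

```r
are.unique.colnames <- function(array) {
  length(unique(colnames(array))) == dim(array)[2]
}

# Illustrative matrix with a repeated column name.
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("x", "y", "x")))

are.unique.colnames(m)      # FALSE
are.unique.colnames(iris)   # TRUE

# Caveat: no column names at all also yields FALSE.
are.unique.colnames(matrix(1:6, nrow = 2))  # FALSE
```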

Upvotes: 0
