Ignacio

Reputation: 7948

Generate n unique names with R

I want to generate n unique names using R. This is the code I have right now:

library(randomNames)
set.seed(6232015)
gen.names <- function(n){
  while(n>0){ 
    names <- unique(randomNames(n=n))
    n <- n - length(names)
    names2 <- c(unique(randomNames(n=n)), names)
    n <- n - length(names2)
    }
  return(names2)
  }
nombres<- gen.names(n = 40000)

Instead of getting 40000 names, I'm getting 39999. If I ask for fewer than 40000, I get exactly the number I asked for. What is wrong with my code?

Thanks!

Upvotes: 3

Views: 2099

Answers (2)

Ignacio

Reputation: 7948

Thanks @jeremycg!

This is my solution after reading your answer:

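library(randomNames)
set.seed(6232015)
gen.names <- function(n){
  # First draw: keep only the unique names
  names <- unique(randomNames(n=n))
  # How many names are still missing
  need <- n - length(names)
  while(need>0){ 
    # Top up with `need` new draws and de-duplicate the combined vector
    names <- unique(c(randomNames(n=need), names))
    need <- n - length(names)
    }
  return(names)
  }
nombres<- gen.names(n = 100000)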
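
One way to confirm the result, sketched in base R, is to check both the length and the absence of duplicates. `is_unique_set` is a hypothetical helper name, and the short vectors below only stand in for the output of gen.names:

```r
# Hypothetical helper: TRUE only if x has exactly n elements and no duplicates
is_unique_set <- function(x, n) {
  length(x) == n && anyDuplicated(x) == 0
}

# Stand-in data instead of real randomNames output
ok  <- is_unique_set(c("Ana", "Luis", "Marta"), 3)  # TRUE
bad <- is_unique_set(c("Ana", "Luis", "Ana"), 3)    # FALSE
```

With the real data, the call would be is_unique_set(nombres, 100000).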

Upvotes: 0

jeremycg

Reputation: 24955

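The second call to randomNames also returns duplicates, so unique() drops one name. The update to n is then wrong: it subtracts the length of the combined vector names2 instead of only the newly added names, so n goes negative and the while loop exits.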

Let's walk through it:

names <- unique(randomNames(n=n))
n <- n - length(names)

You got 38986 unique names, and n is now 40000 - 38986 = 1014.

Now:

names2 <- c(unique(randomNames(n=n)), names)
n <- n - length(names2)

You got 1013 new unique names, giving 39999 total names in names2, and n is now 1014 - 39999 = -38985.

You hit the end of the loop and exit because n is no longer greater than 0, so the function returns names2 with one name missing.
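
The bookkeeping above can be replayed as plain arithmetic. This is only a trace using the counts quoted in the walkthrough (38986 and 1013), not a call to randomNames; the variable names first_unique, second_unique and total are made up for illustration:

```r
n <- 40000

# First pass: 38986 unique names came back
first_unique <- 38986
n <- n - first_unique            # 1014 names still missing

# Second pass: 1013 unique names came back, one short
second_unique <- 1013
total <- second_unique + first_unique   # 39999 names in names2

# The faulty update subtracts the combined length, not the new names
n <- n - total                   # -38985

keep_looping <- n > 0            # FALSE, so the while loop stops
```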

Here's a hacky solution, producing 2000 extra names, then checking in a loop:

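library(randomNames)
gen.names <- function(n){
  names <- c()
  # Keep drawing until at least n unique names have been collected
  while(length(names) < n){
    # Over-draw by 2000 so duplicates rarely force another pass
    names <- unique(c(randomNames(n = n + 2000), names))
  }
  # Trim the surplus
  return(names[seq_len(n)])
}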
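
The loop above never ends if the generator cannot produce n distinct values. Below is a minimal sketch of the same top-up idea with a cap on the number of passes. Everything named here is hypothetical: gen_unique, its draw and max_rounds arguments, and draw_pairs, a stand-in generator built on base R's sample() so the example runs without randomNames:

```r
# Hypothetical helper: `draw` is any function that returns k candidate strings
gen_unique <- function(n, draw, max_rounds = 50) {
  out <- character(0)
  for (round in seq_len(max_rounds)) {
    need <- n - length(out)
    if (need <= 0) break
    # Top up with only the missing count, then de-duplicate everything
    out <- unique(c(out, draw(need)))
  }
  if (length(out) < n) {
    stop("could not collect ", n, " unique values in ", max_rounds, " rounds")
  }
  out[seq_len(n)]
}

# Stand-in generator: random two-letter strings (only 676 possible values)
draw_pairs <- function(k) {
  paste0(sample(letters, k, replace = TRUE),
         sample(letters, k, replace = TRUE))
}

set.seed(1)
res <- gen_unique(300, draw_pairs)
```

To use it with the package, something like gen_unique(40000, function(k) randomNames(n = k)) should work, assuming randomNames returns a character vector as in the question.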

Upvotes: 1
