Reputation: 1538
I'm building up a data frame based on random entries/rows. Here's the function that creates a random entry:
createRandomEntry <- function() {
  names <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')
  ages <- 30:45
  return(
    data.frame(
      Name = sample(names, 1),
      Age = sample(ages, 1),
      stringsAsFactors = FALSE
    )
  )
}
Now I'm combining them into one big data.frame using this function:
createRandomEntries <- function(n) {
  df <- createRandomEntry()
  for (i in 2:n) {
    df <- rbind(df, createRandomEntry())
  }
  return(df)
}
Technically, it works well, but it's a bit clumsy for a few reasons:
- createRandomEntry() is called in two places
- calling rbind that often might be inefficient, I don't know...
In an earlier version, createRandomEntry() returned a list rather than a data.frame. Then I used replicate() to create a matrix, which first had to be transposed (by calling t() on it) in order to create a data.frame out of it. And that data.frame wasn't sortable (error: "unimplemented type 'list' in 'orderVector1'"). Calling unlist() on every row or returning a vector from createRandomEntry() would fix the sorting issues, but then I'd just get strings in every column.
There must be a better way. But how?
Edit: It's important to have a function that creates a single entry, because some of the values of an entry could be related to each other, as this enhanced function shows:
createRandomEntry <- function() {
  names <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')
  ages <- 30:45
  startedIn <- sample(1995:2005, 1)
  lostMotivation <- startedIn + sample(1:3, 1)
  return(
    data.frame(
      Name = sample(names, 1),
      Age = sample(ages, 1),
      StartYear = startedIn,
      LostMotivation = lostMotivation,
      stringsAsFactors = FALSE
    )
  )
}
createRandomEntries(3)
Which produces:
     Name Age StartYear LostMotivation
1   Ashok  42      1998           2000
2 Dilbert  43      1997           1999
3 Dilbert  30      1996           1999
Upvotes: 0
Views: 52
Reputation: 1538
Based on Bruno Zamengo's answer, I've now rewritten the function:
createRandomEntries <- function(n) {
  names <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')
  ages <- 30:45
  df <- data.frame(
    Name = sample(names, n, replace = TRUE),
    Age = sample(ages, n, replace = TRUE),
    StartYear = sample(1995:2005, n, replace = TRUE),
    stringsAsFactors = FALSE
  )
  df$LostMotivation <- df$StartYear + sample(1:3, n, replace = TRUE)
  return(df)
}
However, I didn't use merge, as suggested.
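If a separate single-entry function is still wanted, as the edit to the question asks, one possible sketch is to collect the rows in a list and bind them once at the end instead of growing the data frame inside a loop. This is an alternative and not something taken from Bruno Zamengo's answer; the function bodies are adapted from the question, and the use of lapply() with do.call(rbind, ...) is my assumption about what fits here.

```r
# Single-entry generator, adapted from the question's enhanced version
createRandomEntry <- function() {
  startedIn <- sample(1995:2005, 1)
  data.frame(
    Name = sample(c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper'), 1),
    Age = sample(30:45, 1),
    StartYear = startedIn,
    LostMotivation = startedIn + sample(1:3, 1),  # depends on StartYear
    stringsAsFactors = FALSE
  )
}

createRandomEntries <- function(n) {
  # seq_len(n) yields exactly n iterations, so n = 1 gives one row
  # (2:n would count down from 2 to 1 in that case)
  entries <- lapply(seq_len(n), function(i) createRandomEntry())
  # Bind all one-row data frames in a single call
  do.call(rbind, entries)
}

df <- createRandomEntries(5)
df[order(df$Age), ]  # columns are plain vectors, so sorting works
```

createRandomEntry() is now called in only one place, and each column keeps its own type (character for Name, numeric for the rest).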
Upvotes: 0
Reputation: 860
Just move n from the second function to the first one?
createRandomEntries <- function(n) {
  names <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')
  ages <- 30:45
  return(
    data.frame(
      Name = sample(names, n, TRUE),
      Age = sample(ages, n, TRUE),
      stringsAsFactors = FALSE
    )
  )
}
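The question wonders whether calling rbind that often is inefficient. A rough way to check is to time both variants with system.time(). This is only a sketch: the names entriesByLoop and entriesVectorised are hypothetical, the row count is arbitrary, and timings will vary by machine, so no numbers are claimed here.

```r
people <- c('Dilbert', 'Wally', 'Alice', 'Ashok', 'Topper')

# Row-by-row version, as in the question (assumes n >= 2)
createRandomEntry <- function() {
  data.frame(
    Name = sample(people, 1),
    Age = sample(30:45, 1),
    stringsAsFactors = FALSE
  )
}
entriesByLoop <- function(n) {  # hypothetical name
  df <- createRandomEntry()
  for (i in 2:n) {
    df <- rbind(df, createRandomEntry())  # copies df on every iteration
  }
  df
}

# Vectorised version, as in this answer; the third positional
# argument of sample() is `replace`
entriesVectorised <- function(n) {  # hypothetical name
  data.frame(
    Name = sample(people, n, TRUE),
    Age = sample(30:45, n, TRUE),
    stringsAsFactors = FALSE
  )
}

n <- 500
loopTime <- system.time(a <- entriesByLoop(n))
vecTime <- system.time(b <- entriesVectorised(n))
print(loopTime["elapsed"])
print(vecTime["elapsed"])
```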
Upvotes: 3