Subset of data with replacement

Question

I am trying to sample a subset of my data with replacement. Here is a simple example:

dat <- data.frame (
  group = c(1,1,2,2,2,3,3,4,4,4,4,5,5), 
  var = c(0.1,0.0,0.3,0.4,0.8,0.5,0.2,0.3,0.7,0.9,0.2,0.4,0.6)
)

I want to sample whole groups, based on the group number. If a group is selected, e.g. group = 1, all of its members are selected (two rows in the example above). If a group is selected more than once, each repeat gets a new group number, e.g., 1.1, 1.1, 1.2, 1.2, …. The new data might look like this:

newdat <- data.frame (
  group = c(3,3,5,5,3.1,3.1,1,1,3.2,3.2,5.1,5.1,3.3,3.3,2,2,2), 
  var = c(0.5,0.2,0.4,0.6,0.5,0.2,0.1,0.0,0.5,0.2,0.4,0.6,0.5,0.2,0.3,0.4,0.8)
)

Any help would be greatly appreciated.

Josh O'Brien · Accepted Answer

Here's a fairly simple solution that uses make.unique() to create the names of the groups in newdat:

## Your data
dat <- data.frame (
  group = c(1,1,2,2,2,3,3,4,4,4,4,5,5), 
  var = c(0.1,0.0,0.3,0.4,0.8,0.5,0.2,0.3,0.7,0.9,0.2,0.4,0.6)
) 
## Group ids sampled with replacement (one entry per draw)
n <- c(3,5,3,1,3,2,5,3,2)

## Make a 'look-up' data frame that associates sampled groups with new names,
## then use merge to create `newdat`
df <- data.frame(group = n, 
                 newgroup = as.numeric(make.unique(as.character(n))))
newdat <- merge(df, dat)[-1]
names(newdat)[1] <- "group"
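
The code above takes n as given, and merge() sorts its result on the key by default, so the rows do not come back in the order the groups were drawn. Below is a minimal sketch of one way to draw n and keep the draw order. It assumes that n comes from sample() on the distinct group ids, that 9 draws are wanted, and that a fixed seed is acceptable; none of these are stated in the question. It builds the result with Map() and rbind() instead of merge().

```r
## Your data
dat <- data.frame(
  group = c(1,1,2,2,2,3,3,4,4,4,4,5,5),
  var = c(0.1,0.0,0.3,0.4,0.8,0.5,0.2,0.3,0.7,0.9,0.2,0.4,0.6)
)

## Assumption: fixed seed, only so the draw is reproducible
set.seed(42)

## Assumption: draw 9 groups with replacement from the distinct group ids
n <- sample(unique(dat$group), size = 9, replace = TRUE)

## One label per draw; repeats of group 3 become 3.1, 3.2, ...
newgroup <- as.numeric(make.unique(as.character(n)))

## Pull each drawn group's rows in draw order, relabel them, then stack
pieces <- Map(function(g, lab) {
  rows <- dat[dat$group == g, , drop = FALSE]
  rows$group <- lab
  rows
}, n, newgroup)
newdat <- do.call(rbind, pieces)
rownames(newdat) <- NULL
```

One caveat with numeric labels: make.unique() would name the eleventh copy of group 3 "3.10", and as.numeric("3.10") is 3.1, the same value as the second copy. If a group could be drawn more than ten times, keeping the labels as character (dropping the as.numeric() call) avoids that collision.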
