rnorouzian

Reputation: 7517

Nested sampling of a data.frame in R

In the data.frame p below, there are 757 unique district names (dname) and 5210 unique school names (sname).

How can I sample 126 schools (sname) nested within 40 districts (dname) in R?

That is, for the final sample (say X), dim(table(X$dname, X$sname)) must return [1] 40 126.

In a sense, this is multi-stage sampling, so I'm open to any packages.

p <- read.csv("https://raw.githubusercontent.com/hkil/m/master/a.csv")

Upvotes: 2

Views: 177

Answers (1)

ThomasIsCoding

Reputation: 101064

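You can try the rejection-style loop below for this kind of multi-stage sampling. It draws 40 districts at random, keeps all of their schools, and repeats until the draw contains exactly 126 distinct schools. Note that the loop has no iteration cap, so it keeps running until such a draw turns up: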

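unq_dname <- unique(p$dname)
repeat {
  # draw 40 districts and keep every row (hence every school) belonging to them
  out <- subset(p, dname %in% sample(unq_dname, 40))
  # accept the draw only if those districts contain exactly 126 distinct schools
  if (length(unique(out$sname)) == 126) break
}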

You can then check the dimensions with:

dim(with(out, table(dname, sname)))
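The repeat loop above only stops when the 40 drawn districts happen to hold exactly 126 schools between them. If subsampling schools inside the chosen districts is acceptable, a two-stage draw avoids that wait. The following is only a sketch of that alternative, not the method from the answer: `sample_nested` and the `toy` data are hypothetical names made up for illustration, and it assumes school names do not repeat across districts. It picks the districts first, then one school per district so that every district is represented, then fills the remaining slots from the leftover schools in those districts.

```r
# Toy stand-in for `p` (made-up data): 100 districts, 3 to 8 schools each,
# 3 rows per school. School names are unique across districts by construction.
set.seed(1)
n_schools <- sample(3:8, 100, replace = TRUE)
schools <- data.frame(
  dname = rep(paste0("d", 1:100), times = n_schools),
  sname = paste0("s", seq_len(sum(n_schools)))
)
toy <- schools[rep(seq_len(nrow(schools)), each = 3), ]

# Hypothetical helper: two-stage sample of n_s schools nested in n_d districts.
sample_nested <- function(dat, n_d, n_s) {
  stopifnot(n_s >= n_d)
  # one row per district-school pair
  pairs <- unique(dat[c("dname", "sname")])
  # stage 1: draw the districts
  d <- sample(unique(pairs$dname), n_d)
  pool <- pairs[pairs$dname %in% d, ]
  if (nrow(pool) < n_s) stop("drawn districts hold fewer than n_s schools")
  idx <- seq_len(nrow(pool))
  # stage 2a: one school per drawn district, so no district drops out
  first <- tapply(idx, as.character(pool$dname),
                  function(i) i[sample.int(length(i), 1)])
  # stage 2b: fill the remaining slots from the other schools in the pool
  rest <- setdiff(idx, first)
  extra <- rest[sample.int(length(rest), n_s - n_d)]
  keep <- pool[c(first, extra), ]
  # return every original row that belongs to a selected district-school pair
  key <- function(x) paste(x$dname, x$sname, sep = "||")
  dat[key(dat) %in% key(keep), ]
}

X <- sample_nested(toy, n_d = 10, n_s = 25)
dim(with(X, table(dname, sname)))
```

On the toy data the last line should give 10 and 25; for the real data the call would be sample_nested(p, 40, 126). The helper stops with an error if the drawn districts contain fewer schools than requested, in which case it can simply be called again. Schools are not selected with equal probability under this scheme, so it may not suit every design.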

Upvotes: 0
