E.Dennis

Reputation: 11

Create a new data frame of the means of randomly selected rows - looped

Question:

I have a data.frame (hlth) with 49 columns: factors in columns 1:24 and numerics in columns 25:49. I am trying to randomly select 50 rows, calculate column means for the numeric columns only (dropping the factor columns), and place those means into a new data.frame (beta). I would then like to repeat this process 1000 times.

I have attempted this, but the values returned are identical on every iteration, and the new means will not go into the new data.frame.

Here are a few rows and columns of the data.frame (hlth):

DateIn     adgadj   Sex    VetMedCharges pwtfc
1/01/2006  3.033310 STEER  0.00          675.1151
1/10/1992  3.388245 STEER  2540.33       640.2261
1/10/1995  3.550847 STEER  572.78        607.6200
1/10/1996  2.893707 HEIFER 549.42        425.5217
1/10/1996  3.647233 STEER  669.18        403.8238

The code I have used thus far:

set.seed[25]
beta<-data.frame()

net.row<-function(n=50){
 netcol=sample(1:nrow(hlth),size=n ,replace=TRUE)
 rNames <- row.names(hlth)
 subset(hlth,rNames%in%netrow,select=c(25:49))
 colMeans(s1,na.rm=TRUE,dims=1)
 }

 beta$net.row=replicate(1000,net.row()); net.row

The two issues I have detected are:

1) The function returns the same value(s) on each iteration

2) "Error during wrap-up: object of type 'closure' is not subsettable" when the beta$net.row assignment runs

Any suggestions would be appreciated!!!

Upvotes: 1

Views: 119

Answers (1)

Akhil Nair

Reputation: 3274

Just adding to my comment (and firstly pasting it):

netcol=sample(1:nrow(hlth),size=n ,replace=TRUE) should presumably be netrow = ... The error is a scoping problem: R is trying to subset the function beta, presumably because it can't find netRowMeans in the data.frame you've defined, so it moves on to the global environment and throws an error there.
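To illustrate where a "closure" error can come from, here is a minimal sketch meant for a fresh R session. It assumes nothing from the question's data. It shows that base R's beta() function cannot be subset with $, and that the same kind of error arises from writing set.seed[25] with square brackets instead of set.seed(25). The variable names msg_beta and msg_seed are only for illustration.

```r
# base::beta is a function (a "closure"), so `$` cannot be applied to it.
msg_beta <- tryCatch(
  base::beta$net.row,
  error = function(e) conditionMessage(e)
)

# Square brackets try to subset the set.seed function instead of calling it.
msg_seed <- tryCatch(
  set.seed[25],
  error = function(e) conditionMessage(e)
)

print(msg_beta)
print(msg_seed)

# The intended call uses parentheses.
set.seed(25)
```

Both tryCatch() calls capture an error message rather than a value, which suggests checking both the beta object and the set.seed line when this error appears.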

There are also a couple of other things. You don't assign subset(hlth,rNames%in%netrow,select=c(25:49)) to a variable, which I think you mean to assign to s1, so colMeans is probably running on an s1 that you've set in the global environment. Since that s1 never changes, this would also explain why every iteration returns the same values.
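The points above could be sketched as follows. This is not the asker's real data: hlth here is a small made-up stand-in with two factor columns and three numeric columns, and num_cols is a hypothetical helper that would be 25:49 for the real data.frame. The sketch also swaps subset(... %in% ...) for direct row indexing, which keeps repeated draws when sampling with replacement.

```r
set.seed(25)  # parentheses, because set.seed is a function call

# Made-up stand-in for hlth: 2 factor columns followed by 3 numeric columns.
hlth <- data.frame(
  Sex           = factor(sample(c("STEER", "HEIFER"), 200, replace = TRUE)),
  Pen           = factor(sample(letters[1:4], 200, replace = TRUE)),
  adgadj        = rnorm(200, mean = 3.3, sd = 0.3),
  VetMedCharges = rexp(200, rate = 1 / 500),
  pwtfc         = rnorm(200, mean = 550, sd = 100)
)
num_cols <- 3:5  # assumption: would be 25:49 for the real hlth

net.row <- function(n = 50) {
  # Use one name (netrow) for the sampled row indices throughout.
  netrow <- sample(nrow(hlth), size = n, replace = TRUE)
  # Assign the subset to a local s1 so colMeans() never sees a stale global.
  s1 <- hlth[netrow, num_cols, drop = FALSE]
  colMeans(s1, na.rm = TRUE)
}

m1 <- net.row()
m2 <- net.row()
print(rbind(m1, m2))  # two different draws, one named mean per numeric column
```

Because s1 is created inside the function on every call, each call works on a fresh random sample.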

If you want to pass a variable directly into the data frame beta in that manner, you'll have to initialise beta with the right number of columns and rows. The column means you return will be a vector of length 25, so they won't fit in a single column. You would probably be better off initialising a matrix called mat or something similar (to avoid scoping errors masking the actual error messages) with 25 columns and 1000 rows.
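A minimal sketch of the matrix approach, again on made-up data rather than the real hlth: row_means, num_cols, and beta_df are hypothetical names, and the toy frame has only two numeric columns where the real one has 25. Instead of pre-allocating the matrix, this version lets replicate() build it and then transposes the result.

```r
set.seed(25)

# Made-up stand-in for hlth: 1 factor column and 2 numeric columns.
hlth <- data.frame(
  Sex    = factor(sample(c("STEER", "HEIFER"), 200, replace = TRUE)),
  adgadj = rnorm(200, mean = 3.3, sd = 0.3),
  pwtfc  = rnorm(200, mean = 550, sd = 100)
)
num_cols <- 2:3  # assumption: would be 25:49 for the real hlth

row_means <- function(n = 50) {
  idx <- sample(nrow(hlth), size = n, replace = TRUE)
  colMeans(hlth[idx, num_cols, drop = FALSE], na.rm = TRUE)
}

# replicate() returns one column per iteration, so transpose to get
# one row per iteration and one column per numeric variable.
mat <- t(replicate(1000, row_means()))

# Convert to a data.frame under a name that does not shadow base::beta.
beta_df <- as.data.frame(mat)
print(dim(beta_df))
print(head(beta_df, 3))
```

With the real data, mat would have 1000 rows and 25 columns, matching the shape suggested above.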

EDIT: Question has been edited slightly since I posted this, but most points still stand.

Upvotes: 1
