user2813606

Reputation: 921

Write Loop To Perform Function through Column Names

I have a dataset with a quantitative column for which I want to calculate the mean based on groups. The other columns in the dataset are titled [FY2001,FY2002,...,FY2018]. These columns are populated with either a 1 or 0.

I want to calculate the mean of the first column for each of the FY columns when they equal 1. So, I want 18 different means.

I am used to using macros in SAS where I can replace parts of a dataset name or column name using a let statement. This is my attempt at writing a loop in R to solve this problem:

vector = c("01","02","03","04","05","06","07","08","09","10",
         "11","12","13","14","15","16","17","18")
varlist = paste("FY20", vector, sep = "")

abc = for (i in length(varlist)){
    table(ALL_FY2$paste(varlist)[i])
}
abc

This doesn't work since it treats the paste function as a column. What am I missing? Any help would be appreciated.

Upvotes: 2

Views: 387

Answers (1)

akrun

Reputation: 887851

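We can use [[ instead of $ to subset the column: $ takes the name written after it literally, whereas [[ evaluates its argument, so a column name stored in a variable works. In addition, 'abc' should be a list, with each element assigned the table output of the corresponding column inside the for loop (a for loop itself returns NULL, so assigning the loop to 'abc' stores nothing).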

abc <- vector("list", length(varlist)) # initialize a `list` object

Loop over the sequence of indices of 'varlist', not over length(varlist): that is a single number (18 here), so the loop body would run only once.

for(i in seq_along(varlist)) abc[[i]] <- table(ALL_FY2[[varlist[i]]])
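
The question asks for group means rather than tables, and the same [[ indexing carries over. The following is only a sketch under assumptions: the toy data frame, its values, and the column name `amount` are made up for illustration, and the quantitative column is assumed to be the first column of ALL_FY2.

```r
# Toy stand-in for ALL_FY2 (hypothetical values; `amount` is an assumed name
# for the quantitative first column)
ALL_FY2 <- data.frame(
  amount = c(10, 20, 30, 40),
  FY2001 = c(1, 0, 1, 0),
  FY2002 = c(0, 1, 1, 1)
)
varlist <- c("FY2001", "FY2002")

# For each FY column, keep the rows where it equals 1 and average column 1
means <- sapply(varlist, function(v) {
  mean(ALL_FY2[[1]][ALL_FY2[[v]] == 1])
})
means
```

With the real data, 'varlist' would hold all 18 names, so 'means' would be a named vector of 18 values. If the FY columns or the first column can contain NA, which() around the condition and na.rm = TRUE in mean() may be needed.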

However, if we need a single table output covering all the columns in 'varlist', unlist the columns into one vector and build a matching index that repeats each column's position once per row before applying table.

ind <- rep(seq_along(varlist), each = nrow(ALL_FY2))
table(ind, unlist(ALL_FY2[varlist]))
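
To see the shape of that combined table, here is a small sketch on made-up data. The toy values are assumptions, and as a minor variation the index repeats the column names instead of their positions so that the rows of the result are labelled by fiscal year.

```r
# Toy stand-in for ALL_FY2 (hypothetical values)
ALL_FY2 <- data.frame(
  FY2001 = c(1, 0, 1, 0),
  FY2002 = c(0, 1, 1, 1)
)
varlist <- c("FY2001", "FY2002")

# unlist() stacks the columns one after another, so each column name
# is repeated once per row to line up with the stacked values
ind <- rep(varlist, each = nrow(ALL_FY2))
combined <- table(ind, unlist(ALL_FY2[varlist]))
combined
```

Each row of 'combined' is one FY column, and the columns count how many 0s and 1s it contains.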

Upvotes: 2
