Newcomer

Reputation: 23

How to use grep function in for loop

I'm having trouble using the grep function within a for loop. In my data set, I have several columns whose names differ only in the last 5-6 letters. With the loop I want to apply the same operations to all 16 situations.

Here is my code:

situations <- c("KKKTS", "KKKNL", "KKDTS", "KKDNL", "NkKKTS", "NkKKNL", "NkKDTS", "NkKDNL", "KTKTS", "KTKNL", "KTDTS", "KTDNL", "NkTKTS", "NkTKNL", "NkTDTS", "NkTDNL")
View(situations)

for (i in situations[1:16]) {
  
  ## Trust scale
  a <- vector("numeric", length = 1L)
  b <- vector("numeric", length = 1L)
  a <- grep("Tru_1_[i]", colnames(cleandata))
  b <- grep("Tru_5_[i]", colnames(cleandata))
  cleandata[, c(a:b)] <- 8-cleandata[, c(a:b)]
  
  attach(cleandata)
  cleandata$scale_tru_[i] <- (Tru_1_[i] + Tru_2_[i] + Tru_3_[i] + Tru_4_[i] + Tru_5_[i])/5
  detach(cleandata)
}

With the grep function I first want to find the column numbers of e.g. Tru_1_KKKTS and Tru_5_KKKTS. Then I want to reverse-code the items in those columns. The reverse-coding worked without the loop, when I manually used grep for every single situation.

Here is the manual version:

# KKKTS
grep("Tru_1_KKKTS", colnames(cleandata)) #29 -> find the index of respective column
grep("Tru_5_KKKTS", colnames(cleandata)) #33
cleandata[,c(29:33)] <- 8-cleandata[c(29:33)] # trust scale ranges from 1 to 7 [8-1/2/3/4/5/6/7 = 7/6/5/4/3/2/1]

attach(cleandata)
cleandata$scale_tru_KKKTS <- (Tru_1_KKKTS + Tru_2_KKKTS + Tru_3_KKKTS + Tru_4_KKKTS + Tru_5_KKKTS)/5
detach(cleandata)

Upvotes: 1

Views: 266

Answers (3)

Gwang-Jin Kim

Reputation: 9865

situations <- c("KKKTS", "KKKNL", "KKDTS", "KKDNL", "NkKKTS", "NkKKNL", "NkKDTS", "NkKDNL", "KTKTS", "KTKNL", "KTDTS", "KTDNL", "NkTKTS", "NkTKNL", "NkTDTS", "NkTDNL")

# constructor for column names
get_col_names <- function(part) paste("Tru", 1:5, part, sep="_")

for (situation in situations) {
  # reverse-code the values in the columns in place
  cleandata[, get_col_names(situation)] <- 8 - cleandata[, get_col_names(situation)]
  # and calculate the average
  subdf <- cleandata[, get_col_names(situation)]
  cleandata[, paste0("scale_tru_", situation)] <- rowSums(subdf)/ncol(subdf)
}

By the way, you call it "scale" but your code shows an average/mean calculation. (Scale without centering).
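
As for why the original loop finds nothing: inside a quoted string, [i] is not replaced by the loop variable. grep reads it as a regular-expression character class that matches the literal letter "i", so "Tru_1_[i]" cannot match a name like Tru_1_KKKTS. Here is a minimal sketch of building the name with paste0 and looking up its position with match. The data frame toy and its values are made up for illustration and only stand in for cleandata.

```r
# Toy stand-in for cleandata; the column values are invented for illustration
toy <- data.frame(id = 1:2,
                  Tru_1_KKKTS = c(1, 7),
                  Tru_2_KKKTS = c(2, 6))

i <- "KKKTS"

# "[i]" is a character class matching a literal "i", so no column is found
no_hit <- grep("Tru_1_[i]", colnames(toy))

# paste0() inserts the value of i into the column name
col_name <- paste0("Tru_1_", i)

# match() returns the position of the exact name among the column names
col_idx <- match(col_name, colnames(toy))
```

Indexing by name, as in the loop above, avoids the need for positions altogether.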

Upvotes: 0

jogo

Reputation: 12559

You can do:

Mean5 <- function(sit) {
  cnames <- paste0("Tru_", 1:5, "_", sit)
  rowMeans(cleandata[cnames])
}

cleandata[, paste0("scale_tru_", situations)] <- sapply(situations, FUN=Mean5)
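
The snippet above computes the means only; it does not include the reverse-coding step from the question. Here is a rough sketch of doing both in the same style, assuming items on the 1-to-7 scale. The data frame toy, its values, and the helper mean5 are made up for illustration.

```r
# Toy stand-in for cleandata with two situations; all values are invented
situations <- c("KKKTS", "KKKNL")
item_names <- as.vector(outer(paste0("Tru_", 1:5, "_"), situations, paste0))
toy <- as.data.frame(matrix(2, nrow = 3, ncol = length(item_names),
                            dimnames = list(NULL, item_names)))
toy$Tru_1_KKKTS <- c(1, 4, 7)

# Reverse-code all items at once (8 - x maps 1..7 to 7..1)
toy[item_names] <- 8 - toy[item_names]

# Row mean of the five items belonging to one situation
mean5 <- function(sit) rowMeans(toy[paste0("Tru_", 1:5, "_", sit)])

# One new scale_tru_* column per situation
toy[, paste0("scale_tru_", situations)] <- sapply(situations, FUN = mean5)
```

Running the reverse-coding line a second time would undo it, so it should be executed exactly once per data set.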

Upvotes: 1

Edo

Reputation: 7818

How about something like this? It's a bit more compact and you don't have to use attach.

situations <- c("KKKTS", "KKKNL", "KKDTS", "KKDNL", "NkKKTS", "NkKKNL", "NkKDTS", "NkKDNL", "KTKTS", "KTKNL", "KTDTS", "KTDNL", "NkTKTS", "NkTKNL", "NkTDTS", "NkTDNL")

for (i in situations[1:16]) {

    cols   <- paste("Tru",   1:5, i, sep = "_")
    result <- paste("scale_tru" , i, sep = "_")
    
    cleandata[cols] <- 8 - cleandata[cols]
    cleandata[result] <- rowMeans(cleandata[cols])
    
}

I took for granted that when you write a:b you mean all the columns between those two, which I assumed are the items numbered 2 to 4.
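
If you want to check that assumption on your own data, one possible sketch is to look up the positions of the five item columns and test whether they are consecutive. The data frame toy is made up for illustration; replace it with cleandata.

```r
# Toy stand-in for cleandata; the values are invented for illustration
toy <- data.frame(id = 1:2,
                  Tru_1_KKKTS = 1:2, Tru_2_KKKTS = 1:2, Tru_3_KKKTS = 1:2,
                  Tru_4_KKKTS = 1:2, Tru_5_KKKTS = 1:2)

cols <- paste("Tru", 1:5, "KKKTS", sep = "_")

# Positions of the five item columns (NA if a name is missing)
pos <- match(cols, colnames(toy))

# TRUE only if all five exist and sit next to each other in order
is_contiguous <- !anyNA(pos) && all(diff(pos) == 1)
```

If is_contiguous is FALSE, a:b would also pick up unrelated columns, while selecting by name still touches only the five items.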

Upvotes: 0
