florian

Reputation: 696

R - how to avoid looping here?

I am changing a program I wrote. It was originally designed to analyze data for an "nS" vector with a fixed length of 6 elements. Now I want it to handle nS vectors of variable length, anywhere from 1 to 100 elements.

How can I redesign the section under "# other need states" (ideally using an apply function and not a loop)?

# calculate dominant package size for each needstate
package <- as.factor(levels(df$Bagtype)) # name vector with package names
qoo <- data.frame(lapply(package, function(x) sum(df$Nettoerloes[df$NeedState == nS[1] & df$Bagtype == package[x]], na.rm = T))) # first need state  + create data frame
names(qoo) <- package # name columns

# other need states

qoo[2,] <- lapply(package, function(x) sum(df$Nettoerloes[df$NeedState == nS[2] & df$Bagtype == package[x]], na.rm = T))
qoo[3,] <- lapply(package, function(x) sum(df$Nettoerloes[df$NeedState == nS[3] & df$Bagtype == package[x]], na.rm = T))
qoo[4,] <- lapply(package, function(x) sum(df$Nettoerloes[df$NeedState == nS[4] & df$Bagtype == package[x]], na.rm = T)) 
qoo[5,] <- lapply(package, function(x) sum(df$Nettoerloes[df$NeedState == nS[5] & df$Bagtype == package[x]], na.rm = T))
qoo[6,] <- lapply(package, function(x) sum(df$Nettoerloes[df$NeedState == nS[6] & df$Bagtype == package[x]], na.rm = T))

row.names(qoo) <- nS #name rows

Upvotes: 0

Views: 68

Answers (1)

JMenezes

Reputation: 1059

The most efficient way I can see to solve this problem is to use tapply.

First, remove from the dataset all rows where df$NeedState is not one of the values in nS.

df2 <- df[df$NeedState %in% nS, ]

After that we can use tapply to execute the sum:

qoo <- tapply(df2$Nettoerloes, list(df2$NeedState, df2$Bagtype), sum, na.rm = TRUE)

tapply applies the function it was given, in this case sum, to every combination of the grouping variables in list(). The result is a matrix with one row per need state and one column per bag type.

That should work irrespective of how many states nS contains.
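
To make the approach concrete, here is a minimal sketch on made-up data. The column names come from the question, but the values, the contents of nS, and the helper variable need_state are assumptions for illustration only. It also adds two optional steps that are not part of the answer above: ordering the rows like nS, and replacing NA with 0 for combinations that have no rows (the original lapply code returned 0 there).

```r
# Toy data: column names follow the question, values are invented.
df <- data.frame(
  NeedState   = c("A", "A", "B", "B", "C", "C"),
  Bagtype     = factor(c("small", "large", "small", "small", "large", "small")),
  Nettoerloes = c(10, 20, 5, NA, 7, 3),
  stringsAsFactors = FALSE
)
nS <- c("B", "A")  # hypothetical need states of interest, any length

# Keep only rows whose need state is listed in nS.
df2 <- df[df$NeedState %in% nS, ]

# Fix the row order to match nS (need_state is a helper name, not from the question).
need_state <- factor(df2$NeedState, levels = nS)

# Sum revenue for every need state (rows) and bag type (columns).
qoo <- tapply(df2$Nettoerloes, list(need_state, df2$Bagtype), sum, na.rm = TRUE)

# Combinations with no rows come back as NA; set them to 0 if that is preferred.
qoo[is.na(qoo)] <- 0

print(qoo)
```

Here need state "B" has no "large" rows, so that cell ends up as 0 after the last step. If you need a data frame rather than a matrix, as.data.frame(qoo) should convert it.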

Upvotes: 3
