Foothill_trudger

Reputation: 103

Group by with loop in R

Using USPersonalExpenditure, I'm trying to loop through the years, calculate the percentage change between consecutive years for each row individually, and then find the cumulative percentage growth.

So far, I've got:

library(dplyr)
xyz <- USPersonalExpenditure
calc_tot <- function(xyz) {
    yr1 <- (xyz$1945-xyz$1940)/xyz$1940
    yr2 <- (xyz$1950-xyz$1945)/xyz$1945
    yr3 <- (xyz$1955-xyz$1950)/xyz$1950
    yr4 <- (xyz$1960-xyz$1955)/xyz$1955
    return(sum(yr1, yr2, yr3, yr4))
}
new_xyz %>%
    xyz %>%
    calc_tot

This returns:

[1] 0.01026966

which doesn't show the individual rows or the cumulative totals.

Any help would be much appreciated.

Upvotes: 0

Views: 44

Answers (2)

Ronak Shah

Reputation: 388982

I didn't understand the cumulative percentage growth part, but you can calculate the percentage growth between consecutive years with:

calc_tot <- function(xyz) {
  (xyz[, -1] - xyz[, -ncol(xyz)])/xyz[, -ncol(xyz)]
}

calc_tot(xyz)

#                     1945  1950  1955  1960
#Food and Tobacco    1.005 0.339 0.228 0.186
#Household Operation 0.476 0.871 0.259 0.266
#Medical and Health  0.632 0.686 0.442 0.507
#Personal Care       0.904 0.237 0.388 0.588
#Private Education   1.856 0.848 0.444 0.400

Note that the output has one column fewer than the original input. The first column in the output is (1945 - 1940)/1940, and so on for the other columns.
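If "cumulative percentage growth" means compounding the period-to-period changes, one possible reading could be sketched as below. This is only a sketch under that assumption; the names `growth` and `cum_growth` are illustrative and do not come from the question.

```r
# USPersonalExpenditure ships with base R (datasets package) as a matrix:
# rows are expenditure categories, columns are the years 1940 to 1960.
xyz <- USPersonalExpenditure

# Period-over-period growth for each row, same calculation as calc_tot() above.
growth <- (xyz[, -1] - xyz[, -ncol(xyz)]) / xyz[, -ncol(xyz)]

# Assumption: "cumulative growth" means compounded growth since 1940,
# i.e. the running product of (1 + growth) along each row, minus 1.
# apply() over rows returns its results as columns, hence the t().
cum_growth <- t(apply(1 + growth, 1, cumprod)) - 1

round(cum_growth, 3)
```

Because the changes compound, the last column of `cum_growth` matches `xyz[, "1960"] / xyz[, "1940"] - 1`. If a plain running sum of the percentage changes was intended instead, `t(apply(growth, 1, cumsum))` would be the analogous call.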

Upvotes: 1

IRTFM

Reputation: 263342

Maybe this should have been:

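library(dplyr)
# USPersonalExpenditure is a matrix, so convert it before using `$`
xyz <- as.data.frame(USPersonalExpenditure)
calc_tot <- function(xyz) {
    # Column names such as 1945 are non-syntactic and need backticks
    yr1 <- (xyz$`1945` - xyz$`1940`) / xyz$`1940`
    yr2 <- (xyz$`1950` - xyz$`1945`) / xyz$`1945`
    yr3 <- (xyz$`1955` - xyz$`1950`) / xyz$`1950`
    yr4 <- (xyz$`1960` - xyz$`1955`) / xyz$`1955`
    # sum() collapses all rows and periods into a single number
    return(sum(yr1, yr2, yr3, yr4))
}
new_xyz <-  xyz %>%
              calc_tot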

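Perhaps you need to understand that the assignment operator (<-) is quite different from the pipe operator (%>%): the pipe passes its left-hand side to the function on its right, while <- stores the result under a name.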
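A small sketch may make the difference concrete. It assumes the magrittr package is installed (dplyr re-exports the same `%>%`), and `add_one` is a hypothetical helper that is not part of the question's code.

```r
library(magrittr)  # provides %>%; dplyr re-exports the same operator

# Hypothetical helper used only for illustration.
add_one <- function(x) x + 1

# The pipe passes its left-hand side as the first argument of the
# function on its right, so these two calls are equivalent.
piped  <- c(1, 2, 3) %>% add_one
direct <- add_one(c(1, 2, 3))
identical(piped, direct)

# The pipe only computes a value; `<-` is what stores it under a name.
result <- c(1, 2, 3) %>% add_one
```

Read this way, `new_xyz <- xyz %>% calc_tot` is just another spelling of `new_xyz <- calc_tot(xyz)`.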

Upvotes: 0
