Mus

Reputation: 7530

How to calculate the total value based on multiple column values

I have a data frame which contains client names and area data.

I want to calculate the total area for each client, as some clients' areas span multiple floors (for example, Client A may have 202 on Floor 1 and 248 on Floor 2).

I want to create a new column with the total area.

I know how to create the new column:

areas$new_area <- NA

And I know how to calculate the total area for each client (manually):

sum(areas[areas$client == "Client A", "areas"])

What I am having difficulty with is iterating through the data frame and automating the entire process.

I came up with a partial solution that iterates through the data frame, but it just copies each row's own area into the new column, because sum() only ever receives the single value at position i:

for(i in 1:nrow(areas)){
  areas$new_area[i] <- sum(areas$areas[i])
}

Also, I suspect/know that an apply function is almost certainly the approach to take here, but I don't know which one to use nor how to apply it (no pun intended).

How can I a) achieve this and b) achieve it in a cleaner way?

My expected output is something like this (or some variation of it):

--------------------------------------
| Client | Floor | Area |  New Area  |
--------------------------------------
|   A    |   1   | 202  |    202     |
--------------------------------------
|   A    |   2   | 248  |    450     |
--------------------------------------
|   B    |   1   | 1000 |    1000    |
--------------------------------------
|   B    |   2   | 150  |    1150    |
--------------------------------------

I want a new column at the end with the total of all area values for each client (my example shows a cumulative total, but whether it is cumulative or not doesn't matter - it was merely for the purpose of giving an example).

Upvotes: 0

Views: 545

Answers (1)

CCD

Reputation: 610

# Total area per client (one row per client)
summedAreas <- aggregate(Area ~ Client, areas, sum)
# Join the totals back onto the original rows
allYourData <- merge(areas, summedAreas, by = "Client")
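As a rough sketch of how this plays out on the example from the question: the data frame below is reconstructed from the table in the question, so the column names Client, Floor and Area are assumptions taken from its header, and TotalArea is a name made up for illustration. Renaming the total before merging avoids the Area.x / Area.y suffixes that merge adds when both data frames share a non-key column name.

```r
# Example data reconstructed from the question's table
# (column names are assumed from the table header)
areas <- data.frame(
  Client = c("A", "A", "B", "B"),
  Floor  = c(1, 2, 1, 2),
  Area   = c(202, 248, 1000, 150)
)

# One row per client with the summed area
summedAreas <- aggregate(Area ~ Client, data = areas, FUN = sum)

# Rename the summed column (hypothetical name) so it does not clash with Area
names(summedAreas)[names(summedAreas) == "Area"] <- "TotalArea"

# Attach each client's total to every row belonging to that client
allYourData <- merge(areas, summedAreas, by = "Client")
print(allYourData)
```

Every row for a client then carries that client's total alongside the original per-floor area.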

I prefer aggregate over tapply because I get a nice data.frame back, but you could also calculate the totals with tapply:

tapply(X = areas$Area, INDEX = areas$Client, FUN = sum)
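If the goal is only to add the new column, base R's ave may be worth a look as an alternative to the approaches above, since it returns a vector of the same length as the input and needs no merge. The sketch below again assumes the column names from the question's table; TotalArea and CumArea are illustrative names, not anything from the original data.

```r
# Example data reconstructed from the question's table
# (column names are assumed from the table header)
areas <- data.frame(
  Client = c("A", "A", "B", "B"),
  Floor  = c(1, 2, 1, 2),
  Area   = c(202, 248, 1000, 150)
)

# Per-client total, repeated on every row of that client
areas$TotalArea <- ave(areas$Area, areas$Client, FUN = sum)

# Running (cumulative) total within each client,
# which matches the "New Area" column in the question's expected output
areas$CumArea <- ave(areas$Area, areas$Client, FUN = cumsum)

print(areas)
```

Note that FUN has to be passed by name in ave, because any unnamed arguments after the first are treated as grouping variables.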

Upvotes: 1
