Reputation: 7530
I have a data frame which contains client names and area data.
I want to calculate the total area for each client, as some areas span multiple floors (for example, Client A may have 202 on Floor 1 and 248 on Floor 2).
I want to create a new column with the total area.
I know how to create the new column:
areas$new_area
And I know how to calculate the total area for each client (manually):
sum(areas[areas$client == "Client A", "areas"])
What I am having difficulty with is iterating through the data frame and automating the entire process.
I came up with a partial solution that iterates through the data frame, but it only returns the single area value at row i rather than a per-client sum (which I know will always happen, because sum() is only ever given the one value from the area column at that row):
for(i in 1:nrow(areas)){
areas$new_area[i] <- sum(areas$areas[i])
}
Also, I suspect/know that an apply function is almost certainly the approach to take here, but I don't know which one to use nor how to apply it (no pun intended).
How can I a) achieve this and b) achieve it in a cleaner way?
My expected output is something like this (or some variation of it):
--------------------------------------
| Client | Floor | Area | New Area |
--------------------------------------
| A | 1 | 202 | 202 |
--------------------------------------
| A | 2 | 248 | 450 |
--------------------------------------
| B | 1 | 1000 | 1000 |
--------------------------------------
| B | 2 | 150 | 1150 |
--------------------------------------
I want a new column at the end with the total of all area values for each client (my example shows a cumulative total, but whether it is cumulative or not doesn't matter - it was merely for the purpose of giving an example).
Upvotes: 0
Views: 545
Reputation: 610
summedAreas <- aggregate(Area ~ Client, areas, sum)
allYourData <- merge(areas, summedAreas, by = "Client")  # per-row Area becomes Area.x, the client total becomes Area.y
I prefer aggregate over tapply because I get a nice data.frame back, but you could calculate the totals with
tapply(X = areas$Area, INDEX = areas$Client, FUN = sum)
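If the goal is only to add the column to the existing data frame, base R's ave() may be a shorter route, since it returns a vector the same length as its input and so needs no merge step. The following is a minimal sketch, assuming the columns are named Client, Floor and Area as in the expected-output table; the sample data is copied from that table, and the new column names Total_Area and Cumulative_Area are made up for illustration.

```r
# Sample data taken from the table in the question
areas <- data.frame(
  Client = c("A", "A", "B", "B"),
  Floor  = c(1, 2, 1, 2),
  Area   = c(202, 248, 1000, 150)
)

# Per-client total, repeated on every row belonging to that client
# (Total_Area is a hypothetical column name)
areas$Total_Area <- ave(areas$Area, areas$Client, FUN = sum)

# Running total within each client, as in the question's example table
# (Cumulative_Area is a hypothetical column name)
areas$Cumulative_Area <- ave(areas$Area, areas$Client, FUN = cumsum)

print(areas)
```

With FUN = sum every row of a client gets the same total, while FUN = cumsum reproduces the cumulative column shown in the question, so either variation can be chosen by changing one argument.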
Upvotes: 1