Reputation: 961
I have a dataset which looks something like this:
Area Num
[1,] "Area 1" "99"
[2,] "Area 3" "85"
[3,] "Area 1" "60"
[4,] "Area 2" "90"
[5,] "Area 1" "40"
[6,] "Area 3" NA
[7,] "Area 4" "10"
...
code:
structure(c("Area 1", "Area 3", "Area 1", "Area 2", "Area 1",
"Area 3", "Area 4", "99", "85", "60", "90", "40", NA, "10"), .Dim = c(7L,
2L), .Dimnames = list(NULL, c("Area", "Num")))
I need to do some calculation on values in Num
for each Area
, for example calculating the sum of each Area
, or the summary
of each Area
.
I'm thinking of using a nested
for loop to achieve this, but I'm not sure how to.
Upvotes: 0
Views: 81
Reputation: 4037
In order to apply the function to every level of the factor, we can recurse to the by
function:
dt <- structure(c("Area 1", "Area 3", "Area 1", "Area 2", "Area 1",
"Area 3", "Area 4", "99", "85", "60", "90", "40", NA, "10"), .Dim = c(7L, 2L), .Dimnames = list(NULL, c("Area", "Num")))
dt <- data.frame(dt)
dt$Num <- as.numeric(dt$Num)
t <- by(dt$Num, dt$Area, sum)
t
Upvotes: 2
Reputation: 3427
Doing the same thing using data.table
library(data.table)
dt <- data.table(df)
dt[,sum(as.numeric(Num),na.rm=T),by=Area]
## Area V1
## 1: Area 1 199
## 2: Area 3 85
## 3: Area 2 90
## 4: Area 4 10
Upvotes: 1
Reputation: 1344
You can do this using aggregate
, but the dplyr
package makes it very easy to work with such problems. There are plenty of duplicates of this question, though.
library(dplyr)
df <- structure(c("Area 1", "Area 3", "Area 1", "Area 2", "Area 1",
"Area 3", "Area 4", "99", "85", "60", "90", "40", NA, "10"), .Dim = c(7L,
2L), .Dimnames = list(NULL, c("Area", "Num")))
df <- data.frame(df)
df$Num <- as.numeric(df$Num)
df2 <- df %>%
group_by(Area) %>%
summarise(totalNum = sum(Num, na.rm=T))
df2
Upvotes: 2