Reputation: 301
I want to calculate the percentage of observations that match certain criteria and store each value in a new data frame, in the cell whose row and column names correspond to those criteria. I then want a separate data frame for each month represented in the data. The data I'm pulling from looks like this:
Occurrence  Total  Criteria1  Criteria2  Month
1           20     A          2016       Jan
5           50     B          2016       Feb
0           10     C          2016       Mar
1           50     A          2017       Jan
5           10     B          2017       Feb
0           20     C          2017       Mar
The new data frames would look like this:
(Jan)  2016  2017
A      0.05  0.02
(Feb)
B      0.1   0.5
(Mar)
C      0     0
So I'm trying to write a for loop, or something comparable, that calculates the percentage of occurrences and adds each one to a new, empty data frame according to the criteria by which it was grouped. So far my code looks like this:
for(i in unique(data$month)){
  df %>%
    group_by(Criteria1, Criteria2) %>%
    summarise(Perc = Occurrence / Total) %>%
    spread(Criteria2, Perc)
}
Upvotes: 2
Views: 1831
Reputation: 50678
A base R option using xtabs
xtabs(Perc ~ Criteria1 + Criteria2, transform(df, Perc = Occurrence / Total))
#         Criteria2
#Criteria1 2016 2017
#        A 0.05 0.02
#        B 0.10 0.50
#        C 0.00 0.00
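Note that xtabs sums the left-hand side within each cell, so the one-liner above gives a true percentage only because every Criteria1/Criteria2 combination has exactly one row in the sample. If your real data can have several rows per cell, a safer variant is to sum the counts first and divide afterwards. A minimal sketch, assuming `df` is rebuilt from the sample in the question (the names `occ`, `tot`, and `perc` are just illustrative):

```r
# Sample data copied from the question
df <- data.frame(
  Occurrence = c(1, 5, 0, 1, 5, 0),
  Total      = c(20, 50, 10, 50, 10, 20),
  Criteria1  = c("A", "B", "C", "A", "B", "C"),
  Criteria2  = c(2016, 2016, 2016, 2017, 2017, 2017),
  Month      = c("Jan", "Feb", "Mar", "Jan", "Feb", "Mar")
)

# Sum occurrences and totals per cell, then divide the two tables
occ  <- xtabs(Occurrence ~ Criteria1 + Criteria2, data = df)
tot  <- xtabs(Total ~ Criteria1 + Criteria2, data = df)
perc <- occ / tot
perc
```

Cells with no observations at all would come out as NaN (0 / 0), so you may want to handle those separately.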
Or a tidyverse option
library(tidyverse)
df %>%
  group_by(Criteria1, Criteria2) %>%
  summarise(Perc = Occurrence / Total) %>%
  spread(Criteria2, Perc)
## A tibble: 3 x 3
## Groups: Criteria1 [3]
#  Criteria1 `2016` `2017`
#  <fct>      <dbl>  <dbl>
#1 A           0.05   0.02
#2 B           0.1    0.5
#3 C           0      0
For your updated data
df %>%
  group_by(Criteria1, Criteria2, Month) %>%
  summarise(Perc = Occurrence / Total) %>%
  spread(Criteria2, Perc)
## A tibble: 3 x 4
## Groups: Criteria1 [3]
#  Criteria1 Month `2016` `2017`
#  <fct>     <fct>  <dbl>  <dbl>
#1 A         Jan     0.05   0.02
#2 B         Feb     0.1    0.5
#3 C         Mar     0      0
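In current tidyr, spread() is superseded by pivot_wider(), so the same reshaping can be written as below. This is only a sketch: it assumes dplyr >= 1.0.0 (for the `.groups` argument) and tidyr >= 1.0.0, rebuilds `df` from the question's sample, and sums the counts per group so that it also works when a group has more than one row. The name `wide` is illustrative.

```r
library(dplyr)
library(tidyr)

# Sample data copied from the question
df <- data.frame(
  Occurrence = c(1, 5, 0, 1, 5, 0),
  Total      = c(20, 50, 10, 50, 10, 20),
  Criteria1  = c("A", "B", "C", "A", "B", "C"),
  Criteria2  = c(2016, 2016, 2016, 2017, 2017, 2017),
  Month      = c("Jan", "Feb", "Mar", "Jan", "Feb", "Mar")
)

wide <- df %>%
  group_by(Criteria1, Month, Criteria2) %>%
  # One percentage per group; .groups = "drop" returns an ungrouped tibble
  summarise(Perc = sum(Occurrence) / sum(Total), .groups = "drop") %>%
  # One column per value of Criteria2 (here `2016` and `2017`)
  pivot_wider(names_from = Criteria2, values_from = Perc)

wide
```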
Or something like this in base R
xtabs(
  Perc ~ Criteria1 + Criteria2,
  transform(df, Perc = Occurrence / Total, Criteria1 = paste(Criteria1, Month, sep = "_")))
#         Criteria2
#Criteria1 2016 2017
#    A_Jan 0.05 0.02
#    B_Feb 0.10 0.50
#    C_Mar 0.00 0.00
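If you really do want a separate data frame per month, as described in the question, one possible approach is to split the data by Month and build one table per piece instead of writing an explicit for loop. A rough base R sketch, again assuming `df` matches the question's sample (`by_month` is an illustrative name):

```r
# Sample data copied from the question
df <- data.frame(
  Occurrence = c(1, 5, 0, 1, 5, 0),
  Total      = c(20, 50, 10, 50, 10, 20),
  Criteria1  = c("A", "B", "C", "A", "B", "C"),
  Criteria2  = c(2016, 2016, 2016, 2017, 2017, 2017),
  Month      = c("Jan", "Feb", "Mar", "Jan", "Feb", "Mar")
)

df$Perc <- df$Occurrence / df$Total

# split() returns a named list with one data frame per month;
# each piece is cross-tabulated and converted back to a data frame
by_month <- lapply(split(df, df$Month), function(d) {
  tab <- xtabs(Perc ~ Criteria1 + Criteria2, data = d,
               drop.unused.levels = TRUE)
  as.data.frame.matrix(tab)
})

by_month$Jan
#   2016 2017
# A 0.05 0.02
```

The result is a named list, so `by_month$Feb` and `by_month$Mar` give the other months. Keeping the data frames in a list is usually easier to work with than creating one variable per month.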
Upvotes: 1