Reputation: 1
I have a data frame in R but am struggling to create percentages from my column of frequencies
Species Year State Disease.status F
1 Hedgehog 1870 L Negative 7
2 Hedgehog 1870 L Positive 17
3 Badger 1990 K Positive 8
4 Blue Tit 1991 P Negative 5
I want to code to create a data frame which creates a percentage from the "F" column which gives the percentage of species in a year, in that state that produced positive results in "Disease.status" instead of negative. Am new to R and any help would be appreciated.
The table below is what I'd like to produce from the table above.
Species Year State F%
1 Hedgehog 1870 L 29
2 Badger 1990 K 100
3 Blue Tit 1991 P 0
Upvotes: 0
Views: 218
Reputation: 3722
You can spread
the Disease.status column into wide format and then calculate, which is probably my preferred method.
df <- data.frame(Species = c("Hedgehog", "Hedgehog", "Badger", "Blue Tit"), State = c("L", "L", "K", "P"), Disease.status = c("Negative", "Positive", "Positive", "Negative"), F = c(7, 17, 8, 5))
new_df <- df %>%
spread(Disease.status, F) %>%
mutate(Negative = ifelse(is.na(Negative), 0, Negative),
Positive = ifelse(is.na(Positive), 0, Positive),
F_pcent = Negative/(Negative + Positive))
new_df
Species State Negative Positive F_pcent
1 Badger K 0 8 1.0000000
2 Blue Tit P 5 0 0.0000000
3 Hedgehog L 7 17 0.2916667
Upvotes: 1
Reputation: 17648
You can try:
# read your data
d <- read.table(text="Species Year State Disease.status F
Hedgehog 1870 L Negative 7
Hedgehog 1870 L Positive 17
Badger 1990 K Positive 8
BlueTit 1991 P Negative 5", header=T)
library(tidyverse)
d %>%
group_by(Species, Year, State) %>%
summarise(total=sum(F),
percent= round(sum(ifelse(Disease.status == "Positive",F/total,0))*100,1))
# A tibble: 3 x 5
# Groups: Species, Year [?]
Species Year State total percent
<fctr> <int> <fctr> <int> <dbl>
1 Badger 1990 K 8 100.0
2 BlueTit 1991 P 5 0.0
3 Hedgehog 1870 L 24 70.8
The idea is to group the data, then calcualte the total number in each group using the summarise function. The result is cacluated using ifelse
setting negatives to zero
and summing up the percentages. All in one using tidyverse/dplyr
Upvotes: 0