Reputation: 1
Hello, I am fairly new to R and to coding in general, and I couldn't find a solution to the problem I am having. I have a dataset that looks something like this:
Passage Name | Guild | Passage yes/no |
---|---|---|
A | Rheophilic | 1 |
A | Eurytopic | 1 |
A | Eurytopic | 1 |
A | N/A | 0 |
A | Eurytopic | 1 |
A | Rheophilic | 1 |
A | Eurytopic | 1 |
A | Rheophilic | 0 |
A | Rheophilic | 1 |
B | Eurytopic | 1 |
B | Eurytopic | 1 |
B | Eurytopic | 0 |
B | Eurytopic | 1 |
B | Eurytopic | 1 |
B | Eurytopic | 1 |
B | Rheophilic | 1 |
B | Limnophilic | 1 |
B | Limnophilic | 0 |
Here I want to calculate the percentage of passage per passage name and guild and put this into a new data frame. For example, in passage A, 4 Rheophilic species were found, of which 3 passed, so 75% of Rheophilic species passed.
So the new data frame for passage A should look something like this:
Passage Name | Guild | Passage % |
---|---|---|
A | Rheophilic | 75% |
A | Eurytopic | 100% |
I have previously calculated this by hand for every passage in my dataset, but I feel it could be done much more smoothly in R, which would also prevent mistakes on my part. I have tried methods like the ones in this post: https://www.stackoverflow.com/questions/30951617/how-to-count-and-calculate-percentages-for-two-columns-in-an-r-data-frame. However, it doesn't fully answer my question and I am too inexperienced with R to adapt it myself. So, if someone would like to teach me, that would be greatly appreciated. I am also quite new to Stack Overflow, so sorry if the formatting isn't perfect.
Upvotes: 0
Views: 312
Reputation: 887108
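We may group by passage name and guild and take the mean of the 0/1 column, which is the proportion that passed: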
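    library(dplyr)

    df1 %>%
      na.omit() %>%                        # drop rows that contain NA in any column
      group_by(`Passage Name`, Guild) %>%
      # the mean of a 0/1 column is the share of 1s; a name with a space or % needs backticks
      summarise(`Passage %` = 100 * mean(`Passage yes/no`), .groups = "drop")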
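For a reproducible check, here is a minimal sketch that rebuilds the sample rows from the question and runs the same grouping. It assumes the `N/A` guild is stored as a real `NA` (if it is the string `"N/A"`, `na.omit()` will not drop it), and it swaps `na.omit()` for `filter(!is.na(Guild))` so that only a missing guild removes a row. The `n_found` and `n_passed` columns are hypothetical names added here for illustration; they are not in the original data.

```r
library(dplyr)

# Sample rows typed in from the question; df1 is the name used in the answer.
# check.names = FALSE keeps column names with spaces and slashes unchanged.
df1 <- data.frame(
  `Passage Name` = rep(c("A", "B"), each = 9),
  Guild = c("Rheophilic", "Eurytopic", "Eurytopic", NA, "Eurytopic",
            "Rheophilic", "Eurytopic", "Rheophilic", "Rheophilic",
            "Eurytopic", "Eurytopic", "Eurytopic", "Eurytopic", "Eurytopic",
            "Eurytopic", "Rheophilic", "Limnophilic", "Limnophilic"),
  `Passage yes/no` = c(1, 1, 1, 0, 1, 1, 1, 0, 1,
                       1, 1, 0, 1, 1, 1, 1, 1, 0),
  check.names = FALSE
)

result <- df1 %>%
  filter(!is.na(Guild)) %>%                 # assumption: N/A is a real NA
  group_by(`Passage Name`, Guild) %>%
  summarise(
    n_found     = n(),                      # hypothetical helper column: rows per group
    n_passed    = sum(`Passage yes/no`),    # hypothetical helper column: rows that passed
    `Passage %` = 100 * mean(`Passage yes/no`),
    .groups = "drop"                        # return an ungrouped data frame
  )

print(result)
```

Keeping the counts next to the percentage makes it easy to compare against values worked out by hand. For example, passage B has 2 Limnophilic rows of which 1 passed, so its `Passage %` is 50.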
Upvotes: 2