Reputation: 8494
For a sample dataframe:
df <- structure(list(name = c("a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
"v", "w", "x", "y", "z", "a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
"v", "w", "x", "y", "z", "a", "b", "c", "d", "e", "f", "g", "h",
"i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u",
"v", "w", "x", "y", "z"), amount = c(11L, 9L, 5L, 13L, 15L, 16L,
2L, 5L, 6L, 8L, 9L, 15L, 16L, 17L, 13L, 11L, 10L, 9L, 8L, 7L,
6L, 8L, 15L, 16L, 15L, 9L, 8L, 7L, 6L, 5L, 18L, 16L, 1L, 14L,
15L, 13L, 12L, 11L, 10L, 9L, 8L, 5L, 6L, 9L, 10L, 12L, 13L, 6L,
8L, 15L, 16L, 15L, 9L, 8L, 7L, 6L, 5L, 18L, 16L, 1L, 14L, 15L,
13L, 12L, 11L, 10L, 9L, 13L, 15L, 16L, 17L, 18L, 19L, 20L, 22L,
17L, 16L, 8L), decile = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L,
10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L,
5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L,
4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
8L, 9L, 10L, 1L, 2L, 3L, 4L, 5L, 6L), time = c(2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L, 2016L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L, 2017L,
2017L, 2017L, 2017L, 2017L, 2017L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L)), .Names = c("name", "amount", "decile",
"time"), row.names = c(NA, -78L), class = c("tbl_df", "tbl",
"data.frame"), spec = structure(list(cols = structure(list(name = structure(list(), class = c("collector_character",
"collector")), amount = structure(list(), class = c("collector_integer",
"collector")), decile = structure(list(), class = c("collector_integer",
"collector")), time = structure(list(), class = c("collector_integer",
"collector"))), .Names = c("name", "amount", "decile", "time"
)), default = structure(list(), class = c("collector_guess",
"collector"))), .Names = c("cols", "default"), class = "col_spec"))
I wish to produce a summary table detailing a count of the number of items by decile. One solution is listed below:
data.frame(table(df$decile))
Alongside this, I want to add an additional column which records the percentage of 'names' (essential rows) which have an 'amount' greater or equal to 10 BY decile.
Any ideas how to complete my coding?
Upvotes: 0
Views: 47
Reputation: 1445
If you don't mind using tidyverse:
library(tidyverse)
df %>% group_by(decile) %>% summarize(mean(amount>=10))
decile `mean(amount >= 10)`
<int> <dbl>
1 1 0.556
2 2 0.444
3 3 0.556
4 4 0.667
5 5 0.667
6 6 0.667
7 7 0.5
8 8 0.333
9 9 0.667
10 10 0.667
Calculates per decile how many rows have an amount
value >= 10. Is that what you want? (do you mean percentage of "rows" rather than "names"?)
Upvotes: 1