Reputation: 413
I have a bunch of categorical ship "Type" data, e.g. passenger, fishing, cargo etc. within different distances offshore (DOS, e.g 0-12 nm, 0-25 nm etc.) for different months of the year.
Initially I want to get a count of the number of Type, e.g. passenger, ships for each DOS for the whole year/data set. I then want to do the same for each month of the year.
I guess this would be some kind of group_by function followed by summaries?? but I am having a hart time trying to get the output as I am not that great using dplyr yet.
some things I have tried:
ships <- df %>% group_by(DOS, Type)
shipc <- summarize(ships, count = n())
df1 <- gather(df, Type, DOS) %>% count(Type, DOS) %>% spread(DOS, n, fill = 0)
But I am pretty sure it is not working, because I do not understand the syntax properly....
Here is some dummy data:
df <- structure(list(Type = c("Cargo ship", "Cargo ship", "Cargo ship",
"Cargo ship", "Cargo ship", "Cargo ship", "Fishing", "Fishing",
"Fishing", "Fishing", "Fishing", "Cargo ship", "Cargo ship",
"Cargo ship", "Cargo ship", "Cargo ship", "Fishing", "Fishing",
"Fishing", "Fishing", "Fishing", "Fishing", "Fishing", "Fishing",
"Fishing", "Cargo ship:DG,HS,MP(A)", "Cargo ship", "Cargo ship",
"Fishing", "Fishing", "Fishing", "Fishing", "Fishing", "Tanker",
"Cargo ship", "Cargo ship", "Fishing", "Fishing", "Cargo
ship:DG,HS,MP(A)",
"Cargo ship:DG,HS,MP(D)", "Cargo ship:DG,HS,MP(D)", "Cargo
ship:DG,HS,MP(D)",
"Cargo ship"), DOS = c("0-100", "0-50", "0-25", "0-100", "0-50",
"0-25", "0-100", "0-25", "0-12", "0-50", "0-100", "0-50", "0-100",
"0-25", "0-50", "0-100", "0-50", "0-25", "0-50", "0-100", "0-25",
"0-100", "0-100", "0-50", "0-25", "0-100", "0-100", "0-50", "0-100",
"0-50", "0-25", "0-100", "0-100", "0-100", "0-50", "0-100", "0-100",
"0-100", "0-100", "0-25", "0-50", "0-100", "0-100"), Month = c("May",
"May", "May", "May", "May", "May", "May", "May", "May", "May",
"June", "June", "June", "June", "June", "June", "June", "June",
"June", "June", "June", "August", "August", "August", "August",
"August", "August", "August", "August", "August", "August", "August",
"January", "January", "January", "January", "January", "January",
"January", "January", "January", "January", "January"), Year = c(2018,
2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018,
2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2018, 2019, 2019,
2019, 2019, 2019, 2019, 2019, 2019, 2019, 2019, 2019)), row.names = c(NA,
-43L), class = c("tbl_df", "tbl", "data.frame"))
What I want is the Type category, the DOS and the count of the total ships which fall into those unique identifiers. I then further want that grouped by month and year.
Upvotes: 0
Views: 56
Reputation: 886938
Not clear about the expected. Based on the description, group by all the columns (group_by_all
), get the frequency count (n()
) and spread
to 'wide' format
library(dplyr)
df %>%
group_by_all %>%
summarise(n = n()) %>%
spread(DOS, n, fill = 0)
Or use count
(group_by
+ summarise
) and spread
df %>%
dplyr::count(Type, DOS, Month, Year) %>%
spread(DOS, n, fill = 0)
Upvotes: 1