user11058887
user11058887

Reputation: 39

Why does Dplyr group_by not respect .drop=FALSE

I have a very simple data count problem where I am grouping by two fields (MktDate,Indexname) which are date and chr types respectively. I have a few indexes with no data on certain dates and all I'm looking for my code to do is to include a 0 on those dates. I have tried a number of different solutions following dplyr's group_by documentation but I can not get this short piece of code to return a value of 0 when no data is present. Here is my code

IdxStats <- IdxData %>% 
  group_by(MktDate,IndexName,.drop=FALSE) %>% 
  summarize(CountSecurity = length(MktDate)) %>% 
  ungroup

Upvotes: 2

Views: 1165

Answers (1)

StupidWolf
StupidWolf

Reputation: 46908

The .drop is correct, but when you use length, the data.frame doesn't exist at all, so length is it, will be weird. Try this below:

set.seed(100)
IdxData = data.frame(MktDate=sample(1:3,10,replace=TRUE),
                     IndexName=sample(LETTERS[1:3],10,replace=TRUE))

IdxData %>% count(MktDate,IndexName,.drop=FALSE)
# A tibble: 9 x 3
  MktDate IndexName     n
    <int> <fct>     <int>
1       1 A             0
2       1 B             0
3       1 C             1
4       2 A             1
5       2 B             1
6       2 C             4
7       3 A             0
8       3 B             2
9       3 C             1

Or if you need the name "CountSecurity" (thanks to @arg0naut91 ) :

IdxData %>% 
count(MktDate,IndexName,.drop=FALSE,name="CountSecurity")

Upvotes: 3

Related Questions