Ozgur Alptekın
Ozgur Alptekın

Reputation: 505

Create new variable with most recent date by group in dplyr

I have a dataframe that ı want to create 2 new variable with that group by Id. first I need to group by Id and get most recent date by createdDatethen I need to get Lead_DataSource__cagain based on most recent date

Here is tail of my dataframe;

tail(df)


Id                  CreatedDate Lead_DataSource__c   StageName
0011000001XW3YZAA1  2020-07-17            Walk in   Quotation
0011000001XW3Z8AAL  2020-07-17            Walk in   Quotation
0011000001XW3zHAAT  2020-07-17            Walk in    Assigned
0011000001XW3zlAAD  2020-07-17            Walk in   Quotation
0011000001XW3zvAAD  2020-07-17            Walk in Closed Lost
0011000001XW3zvAAD  2020-07-17            Website Closed Lost

Here is my code for this:

df_new<-df %>% group_by(Id)%>%
 mutate(numberoflead=length(Id)) %>% #number of lead
  mutate(lastcreateddateoflead=max(CreatedDate)) %>%#last date of lead
  mutate(lasttouch =max(CreatedDate)[Lead_DataSource__c])%>% #last touch

when I run these codes I didn't get any error it seems work for numberofleads and lastcreateddateoflead but it doesn't seem to work for getting lasttouch

Could anyone help me about what I m missing here?

Upvotes: 0

Views: 337

Answers (2)

Luis Chaves Rodriguez
Luis Chaves Rodriguez

Reputation: 544

Hey I think I get what you are trying to do but it may not be very properly stated. I believe you want the max date per Id and then max date per Lead_DataSource__c, if that is what you want to do maybe try:

df_new <- df %>% group_by(Id) %>%
 mutate(numberoflead=length(Id)) %>% #number of lead
 mutate(lastcreateddateoflead=max(CreatedDate)) %>%
 group_by(Lead_DataSource__c) %>% 
 mutate(lasttouch =max(CreatedDate)) %>%
 ungroup()

Let me know if that's what you tried to accomplish!

Upvotes: 1

Ben Norris
Ben Norris

Reputation: 5747

Your problem is that you are using mutate when you should be using summarize. You need to then join the original df to get lasttouch. If you add a select inside the join, you get just get the lasttouch column without renaming or selecting anything .

library(dplyr)

df %>%
  group_by(Id) %>%
  summarize(numberoflead = n(),
            lastcreateddateoflead=max(CreatedDate)) %>%
  inner_join(df %>% 
               select(Id, CreatedDate, lasttouch = Lead_DataSource__c),
             by = c("Id" = "Id", "lastcreateddateoflead" = "CreatedDate"))
            
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 6 x 4
  Id                 numberoflead lastcreateddateoflead lasttouch
  <chr>                     <int> <date>                <chr>    
1 0011000001XW3YZAA1            1 2020-07-17            Walk in  
2 0011000001XW3Z8AAL            1 2020-07-17            Walk in  
3 0011000001XW3zHAAT            1 2020-07-17            Walk in  
4 0011000001XW3zlAAD            1 2020-07-17            Walk in  
5 0011000001XW3zvAAD            2 2020-07-17            Walk in  
6 0011000001XW3zvAAD            2 2020-07-17            Website  

If you want to keep all rows (and not just one summary per Id), then use your mutate instead of my summarize.

df %>%
  group_by(Id) %>%
  mutate(numberoflead = n(),
            lastcreateddateoflead=max(CreatedDate)) %>%
  inner_join(df %>% 
               select(Id, CreatedDate, lasttouch = Lead_DataSource__c),
             by = c("Id" = "Id", "lastcreateddateoflead" = "CreatedDate"))

# A tibble: 8 x 7
# Groups:   Id [5]
  Id             CreatedDate Lead_DataSource_~ StageName  numberoflead lastcreateddateofl~ lasttouch
  <chr>          <date>      <chr>             <chr>             <int> <date>              <chr>    
1 0011000001XW3~ 2020-07-17  Walk in           Quotation             1 2020-07-17          Walk in  
2 0011000001XW3~ 2020-07-17  Walk in           Quotation             1 2020-07-17          Walk in  
3 0011000001XW3~ 2020-07-17  Walk in           Assigned              1 2020-07-17          Walk in  
4 0011000001XW3~ 2020-07-17  Walk in           Quotation             1 2020-07-17          Walk in  
5 0011000001XW3~ 2020-07-17  Walk in           Closed Lo~            2 2020-07-17          Walk in  
6 0011000001XW3~ 2020-07-17  Walk in           Closed Lo~            2 2020-07-17          Website  
7 0011000001XW3~ 2020-07-17  Website           Closed Lo~            2 2020-07-17          Walk in  
8 0011000001XW3~ 2020-07-17  Website           Closed Lo~            2 2020-07-17          Website 

Upvotes: 1

Related Questions