Omar Gonzales

Reputation: 4008

R - dplyr - How to add a calculated field based on the current data.frame

I'm grouping a data frame by the column "month", and then summarising the "users" column.

Using this code:

Count_Users_By_Month <- Users_By_Month %>% group_by(month) %>% 
  summarise(Users = length(unique(users)))

I get this result, which I'm 100% sure is correct:

     month       Users
1 Diciembre      4916
2 Noviembre      3527

Question 1: How do I add a column showing the variation in "Diciembre" relative to "Noviembre" (as a percentage)?

I need to create a column for the month-to-month variation.

The formula (pseudocode) is this one:

(DiciembreUsers-NoviembreUsers)/NoviembreUsers

Of course, the value for Noviembre would be empty (NA), because there is no data for the previous month (October).
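
As a quick sanity check of the formula above, here is the arithmetic with the two counts from the table. This is only a sketch; the variable names are made up for illustration.

```r
# Counts taken from the summarised table above
diciembre_users <- 4916
noviembre_users <- 3527

# Month-over-month variation, first as a fraction, then as a percentage
variation <- (diciembre_users - noviembre_users) / noviembre_users
variation_pct <- round(variation * 100, 2)

print(variation_pct)  # 39.38
```

So the expected value in the new column for Diciembre is roughly 0.394, i.e. about 39.4%.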

I tried the following code, but I get an error:

Count_Users_By_Month <- Users_By_Month %>% group_by(month) %>% 
  summarise(Users = length(unique(users))) %>%
  mutate(Variacion = (Count_Users_By_Month[1,2]-Count_Users_By_Month[2,2])/Count_Users_By_Month[2,2])

Error: not compatible with STRSXP

Last edit:

Problem solved, thanks @Khashaa (see the comments).

I changed "lag" to "lead" and it worked. I also added "lead" to the division part to get the formula right.

mutate(variation=(Users-lead(Users))/lead(Users))

Upvotes: 2

Views: 6115

Answers (1)

Omar Gonzales

Reputation: 4008

This is the summarised data frame, before adding the variation column:

    month       Users
1 Diciembre      4916
2 Noviembre      3527

This is the answer:

Count_Users_By_Month <- Users_By_Month %>% group_by(month) %>% 
                        summarise(Users = length(unique(users))) %>%
                        mutate(variation=(Users-lead(Users))/lead(Users))

I still need to investigate how "lead" works. All credit goes to @Khashaa; see his answer in the comments. I only modified the formula, adding "lead" to the division part to get the right answer.
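
For anyone else wondering why "lead" rather than "lag" is needed here, a minimal sketch, assuming the summarised table looks exactly like the one above (it is rebuilt by hand below, and the "next_users" column is only added for illustration). The grouped summary comes back sorted by month name, so "Diciembre" sits in row 1 and "Noviembre" in row 2. The previous month is therefore in the next row, which is what lead() looks at.

```r
library(dplyr)

# The summarised table from above, rebuilt by hand for illustration
counts <- data.frame(
  month = c("Diciembre", "Noviembre"),
  Users = c(4916, 3527),
  stringsAsFactors = FALSE
)

# lead(x) shifts the values up by one row: each row sees the NEXT row's
# value, and the last row gets NA because nothing follows it.
result <- counts %>%
  mutate(
    next_users = lead(Users),
    variation  = (Users - lead(Users)) / lead(Users)
  )

print(result)
```

Row 1 (Diciembre) gets (4916 - 3527) / 3527, and row 2 (Noviembre) gets NA because there is no earlier month to compare against. If the rows were in chronological order instead, lag() would be the function to use.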

Upvotes: 1
