Reputation: 865
I am having a data frame in R as follows:
structure(list(firstpurchaseMonth = c("January 2019", "January 2019",
"January 2019", "January 2019", "January 2019", "January 2019",
"February 2019", "February 2019", "February 2019", "February 2019",
"February 2019", "March 2019", "March 2019", "March 2019", "March 2019",
"April 2019", "April 2019", "April 2019", "May 2019", "May 2019",
"June 2019"), purchaseMonth = c("April 2019", "February 2019",
"January 2019", "June 2019", "March 2019", "May 2019", "April 2019",
"February 2019", "June 2019", "March 2019", "May 2019", "April 2019",
"June 2019", "March 2019", "May 2019", "April 2019", "June 2019",
"May 2019", "June 2019", "May 2019", "June 2019"), monthlyRevenue = c(42455,
65953, 559367, 5218, 29570, 56850, 20263, 502867, 2994, 31974,
43023, 35512, 1873, 601097, 34865, 784367, 9246, 57395, 18594,
1849594, 105371)), row.names = c(NA, -21L), class = "data.frame")
The column firstpurchaseMonth is ordered based on the vector month
month <- c("January 2019" , "February 2019", "March 2019", "April 2019", "May 2019", "June 2019", "July 2019")
monthlyRevenue <- monthlyRevenue[order(match(monthlyRevenue$firstpurchaseMonth, month)), ]
I need to order the purchaseMonth
column based on the month
where the order of firstpurchaseMonth
remains the same.
For instance, with the purchaseMonth as January 2019, I need to order purchaseMonth from Jan to June. Similarly, for Feb, the order will be from Feb to June.
This is the output format of the data frame (Month name will be the same as provided in the month vector -- January 2019, February 2019 etc)
Can anyone provide a solution to achieve this?
Upvotes: 0
Views: 58
Reputation: 388807
You could convert the column values to date objects so that you don't have to manage an external vector month
.
library(dplyr)
monthlyRevenue %>%
mutate(across(c(firstpurchaseMonth, purchaseMonth), zoo::as.yearmon)) %>%
arrange(firstpurchaseMonth, purchaseMonth)
# firstpurchaseMonth purchaseMonth monthlyRevenue
#1 Jan 2019 Jan 2019 559367
#2 Jan 2019 Feb 2019 65953
#3 Jan 2019 Mar 2019 29570
#4 Jan 2019 Apr 2019 42455
#5 Jan 2019 May 2019 56850
#6 Jan 2019 Jun 2019 5218
#7 Feb 2019 Feb 2019 502867
#8 Feb 2019 Mar 2019 31974
#9 Feb 2019 Apr 2019 20263
#10 Feb 2019 May 2019 43023
#11 Feb 2019 Jun 2019 2994
#12 Mar 2019 Mar 2019 601097
#13 Mar 2019 Apr 2019 35512
#14 Mar 2019 May 2019 34865
#15 Mar 2019 Jun 2019 1873
#16 Apr 2019 Apr 2019 784367
#17 Apr 2019 May 2019 57395
#18 Apr 2019 Jun 2019 9246
#19 May 2019 May 2019 1849594
#20 May 2019 Jun 2019 18594
#21 Jun 2019 Jun 2019 105371
Upvotes: 1
Reputation: 26218
Do you want this?
First convert two dates column into ordered factors based on your vector, month.
df$firstpurchaseMonth <- factor(df$firstpurchaseMonth, levels = month, ordered = T)
df$purchaseMonth <- factor(df$purchaseMonth, levels = month, ordered = T)
Thereafter just dplyr::arrange
would work.
library(tidyverse)
df %>% arrange(firstpurchaseMonth, purchaseMonth)
firstpurchaseMonth purchaseMonth monthlyRevenue
1 January 2019 January 2019 559367
2 January 2019 February 2019 65953
3 January 2019 March 2019 29570
4 January 2019 April 2019 42455
5 January 2019 May 2019 56850
6 January 2019 June 2019 5218
7 February 2019 February 2019 502867
8 February 2019 March 2019 31974
9 February 2019 April 2019 20263
10 February 2019 May 2019 43023
11 February 2019 June 2019 2994
12 March 2019 March 2019 601097
13 March 2019 April 2019 35512
14 March 2019 May 2019 34865
15 March 2019 June 2019 1873
16 April 2019 April 2019 784367
17 April 2019 May 2019 57395
18 April 2019 June 2019 9246
19 May 2019 May 2019 1849594
20 May 2019 June 2019 18594
21 June 2019 June 2019 105371
Upvotes: 0