Reputation: 79
I'm just a beginner in R and I am unable to think a very complex logic. Let me tell u
I have my data
Serial no | Invoice no | Date received |
---|---|---|
0C315 | 17-jun-20 | 03-Aug-20 |
0C8AA | 26-jun-20 | 03-Aug-20 |
0C8B5 | 26-jun-20 | 11-Aug-20 |
0C8C0 | 28-jun-20 | 11-Aug-20 |
0C8CB | 18-jun-20 | 03-Aug-20 |
0C8D6 | 18-jun-20 | 11-Aug-20 |
0C8E1 | 20-jun-20 | 24-Aug-20 |
0C8EC | 24-jun-20 | 24-Aug-20 |
0C8F7 | 26-jun-20 | 24-Aug-20 |
What I have to do is...invoice date is basically manufactured date and date received is the recieved date, the product whose manufactured date is earlier that product must be received earlier, simply first come first served. Now I have to list the details of those products whose manufactured date is earlier but recieved later. For this data... In row 1... invoice date is 17 Jun 20 and received on 3 aug 20 which is fine because 03 aug 20 is the earliest among all rows
In 6 th row (count row no. exclude column name)-> 0C8D6...
invoice date - 18 Jun 20 and
received on 11 aug 20.
Now compare this with evry rows..you will find in the 2nd row
invoice date is 26 Jun 20 but recieved on 3 aug 20(manufactured late and recieved earlier than row 6th)
which is not supposed to be, 6 th row should be received earlier not the 2nd row.So 6 th row should be listed down.
Same for last row
Manufactured on - 26 Jun 20
Recieved on 24 Aug 20
But on comparing, found a row which is 4th row....
mfd on 28 Jun 20
And recieved on 11 aug 20(manufactured late but recieved earlier than last row)
Last row should be listed down
My output should be
Serial no | Invoice no | Date received |
---|---|---|
0C8D6 | 18-jun-20 | 11-Aug-20 |
0C8E1 | 20-jun-20 | 24-Aug-20 |
0C8EC | 24-jun-20 | 24-Aug-20 |
0C8F7 | 26-jun-20 | 24-Aug-20 |
I hope I have made clear what my problem is...let me know if you have any doubt regarding the explanation. Please help me to code the problem
Thanks in advance
Upvotes: 0
Views: 67
Reputation: 66
First, you'll need to get your table into a dataframe/tibble with legal column names. If you're not familiar with dates in R, look up as.Date()
.
Then you can do something like this:
library(dplyr)
is_this_order_late <- function(df, row) {
this_date_ordered <- df[row, "Invoice.no"]
this_date_recieved <- df[row, "Date.received"]
# Orders which were placed after this one but were received earlier
quicker_orders <- df %>%
filter(
Invoice.no > this_date_ordered, Date.received < this_date_recieved
)
# If there were no orders that were placed after this one but received earlier, return FALSE;
# Otherise, return TRUE
is_late <- as.logical(nrow(quicker_orders))
return(is_late)
}
df[sapply(1:nrow(df), function(row) {is_this_order_late(df, row)}),]
I made a function that takes in a row and determines whether or not the order corresponding to that row was received late (according to the condition you specified). Then I applied that function to every row in your dataframe using sapply()
, and then filtered the dataframe using a boolean mask.
If you're not already familiar with them, you may want to look up boolean masks, dplyr
's filter()
, and the sapply()
/lapply()
family of functions in R.
Upvotes: 1