Reputation: 197
I have the following data:
Data <- data.frame(Project=c(123,123,123,123,123,123,123,123,124,124,124,124,124,125,125),
Value=c(1,2,3,4,7,3,8,9,8,3,2,5,6,2,3),
OldValue=c("","Open","In Progress","Open","In Progress","Complete","Open","In Progress","Complete","Open","In Progress","System Declined","In Progress","","Open"),
NewValue=c("Open","In Progress","Open","In Progress","Complete","Open","In Progress","Complete","Open","In Progress","System Declined","In Progress","Complete","Open","Complete"))
Data$Request <- ifelse(((Data$OldValue==""|Data$OldValue=="Complete"|Data$OldValue=="System Declined"|Data$OldValue=="In Progress")&Data$NewValue=="Open"),Data$Value,NA)
Data$Start <- ifelse(((Data$OldValue=="Open"|Data$OldValue=="Complete"|Data$OldValue=="System Declined")&Data$NewValue=="In Progress"),Data$Value,NA)
Data$End <- ifelse(((Data$NewValue=="Complete"|Data$NewValue=="System Declined")&(Data$OldValue=="Open"|Data$OldValue=="In Progress")),Data$Value,NA)
I want to be able to determine the number of completed projects with their associated values by unique Project ID. A completed project is one that has both the "Request" and "End" fields populated ("Start" is not a required field).
I'd like to obtain the information using the following criteria:
I'm only interested in capturing the first instance of the "Request", "Start" and "End" fields. Example: top 5 lines of code show two "Request" values, two "Start" values, and one "End" value. I'd want to merge the first instance of each field into one row, so the result would be 1,2,7.
In the same example as above, I'd want the 1,2,7 values to be merged on the first instance of the "NewValue" being "Open", not the second, and not both.
There can be multiple completed projects per uneir Project ID. Example: Project ID 123 should have two completed projects: values 1,2,7 and 3,8,9
This is the result I'm looking for:
Result <- data.frame(Project=c(123,123,124,125),
Value=c(1,3,8,2),
OldValue=c("","Complete","Complete",""),
NewValue=c("Open","Open","Open","Open"),
Request=c(1,3,8,2),
Start=c(2,8,3,""),
End=c(7,9,2,3))
Would really appreciate any help coding this.
Upvotes: 0
Views: 60
Reputation: 2716
it's not clear what the OP wants exactly, but I hope you can tweak this code to obtain the desired result
require(data.table)
Data <- data.table(Data)
Data2 <- Data[,.(OldValue, NewValue,Request = first(Request[!is.na(Request)]),
Start = first(Start[!is.na(Start)]),
End = first(End[!is.na(End)])),
.(Project, Value)]
Upvotes: 1