Reputation: 2283
I have the following dataframe (DF1), which contains much more rows (this is only the head of the dataframe):
"transactionID" "sequenceID" "eventID" "items"
"1" 29698 40928360 1 "OV4"
"2" 23561 41469527 1 "OV6"
"3" 23562 41469527 2 "OV7"
"4" 10514 41487127 1 "OV8"
"5" 10515 41487127 2 "OV9"
"6" 10516 41487127 3 "OV10"
"7" 10517 41487127 4 "OV11"
I would like to filter the all the rows which their maximal eventID value is over 3. As seen, each eventId is grouped by sequenceID. I would like to keep the order and the content of the rows. The required result would be:
"transactionID" "sequenceID" "eventID" "items"
"4" 10514 41487127 1 "OV8"
"5" 10515 41487127 2 "OV9"
"6" 10516 41487127 3 "OV10"
"7" 10517 41487127 4 "OV11"
Since, only sequenceID = 41487127 contains eventID of 4. The rest contains only 1 or 2 eventID. I used the following command:
> data7<-setDT(DF1)[, if(any(eventID >= 3)) .SD, by = sequenceID ]
> data7
sequenceID transactionID eventID items
1: 41487127 10514 1 OV8
2: 41487127 10515 2 OV9
3: 41487127 10516 3 OV10
4: 41487127 10517 4 OV11
The position of the column transactionID was changed. Why is it? What can I do just to filter the dataframe without changing the column's positions? Is my command is the most efficient way for such filtering?
Upvotes: 0
Views: 96
Reputation: 886998
We are grouping by 'sequenceID', which becomes the first column after filtering. But, we can rearrange using the original column names of the dataset.
setDT(DF1)[,if(any(eventID>3)) .SD , sequenceID][,names(DF1),with=FALSE]
# transactionID sequenceID eventID items
#1: 10514 41487127 1 OV8
#2: 10515 41487127 2 OV9
#3: 10516 41487127 3 OV10
#4: 10517 41487127 4 OV11
Suppose, we need the position of the rows that were filtered, we create a row names column ('rn') using keep.rownames
and then do the filter.
setDT(DF1, keep.rownames=TRUE)[, if(any(eventID>3)) .SD , sequenceID
][, names(DF1), with=FALSE]
# rn transactionID sequenceID eventID items
#1: 4 10514 41487127 1 OV8
#2: 5 10515 41487127 2 OV9
#3: 6 10516 41487127 3 OV10
#4: 7 10517 41487127 4 OV11
Upvotes: 1