Reputation: 629
I have a Deedle Data frame that looks like this.
val it : Frame<int,string> =
      Date                        size1 size2
13 -> 2013-12-12T00:00:00.103336Z 133   35
14 -> 2013-12-12T00:00:00.105184Z 83    35
15 -> 2013-12-12T00:00:00.107205Z 83    35
16 -> 2013-12-12T00:00:00.109566Z 83    34
17 -> 2013-12-12T00:00:00.115260Z 83    34
18 -> 2013-12-12T00:00:00.133546Z 83    34
20 -> 2013-12-12T00:00:00.138204Z 82    34
22 -> 2013-12-12T00:00:00.140125Z 81    34
I would like to remove rows that have the same values for both size1 and size2 as the previous row. In pseudo code...
if row?size1 = prevRow?size1 && row?size2 = prevRow?size2 then dropRow
So in the example above I would end up with:
val it : Frame<int,string> =
      Date                        size1 size2
13 -> 2013-12-12T00:00:00.103336Z 133   35
14 -> 2013-12-12T00:00:00.105184Z 83    35
16 -> 2013-12-12T00:00:00.109566Z 83    34
20 -> 2013-12-12T00:00:00.138204Z 82    34
22 -> 2013-12-12T00:00:00.140125Z 81    34
I believe I want to use
Frame.filterRowValues (fun row -> ...)
But I don't see how to compare one row against the previous row. Is there a simple way to do this? Perhaps I need to shift and join?
Upvotes: 1
Views: 620
Reputation: 243106
This can be done in a number of ways, and I'm not quite sure which one is best:
1. Using shift and join (as you say) would certainly work. You'd need to rename the columns in one of the frames so that you can join them, but it sounds like quite a good solution to me.
2. You can use frame.Rows |> Series.pairwise to get tuples containing the previous and the current row, then use Series.filter and Series.map (to select the second row from the tuple) and reconstruct the frame using Frame.ofRows. The only issue is that you'll always lose the first row this way (and you'll have to add it back).
3. You can use Frame.filterRows and find the previous row. The recent release supports Lookup.Smaller, which lets you do that easily.
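As a rough sketch of the pairwise approach, the script below finds the keys of rows that differ from their predecessor and then filters the original frame by those keys, which sidesteps the lost-first-row problem by keeping the first key explicitly. It is a variant of the filter/map/Frame.ofRows route rather than a literal transcription of it. The name df and the sample data are assumptions for illustration (the Date column is left out), and it assumes a Deedle version of 1.0 or later loaded from NuGet in an F# script.

```fsharp
#r "nuget: Deedle"   // assumption: run as an .fsx script with dotnet fsi
open Deedle

// Hypothetical sample data mirroring size1/size2 from the question
let keys  = [ 13; 14; 15; 16; 17; 18; 20; 22 ]
let size1 = [ 133.0; 83.0; 83.0; 83.0; 83.0; 83.0; 82.0; 81.0 ]
let size2 = [ 35.0; 35.0; 35.0; 34.0; 34.0; 34.0; 34.0; 34.0 ]

let df =
    frame [ "size1" => series (List.zip keys size1)
            "size2" => series (List.zip keys size2) ]

// Keys of rows whose size1 or size2 differs from the previous row.
// Series.pairwise yields (previous, current) pairs keyed by the current row,
// so the first row never appears here.
let changedKeys =
    df.Rows
    |> Series.pairwise
    |> Series.filterValues (fun (prev, curr) ->
        prev?size1 <> curr?size1 || prev?size2 <> curr?size2)
    |> Series.keys
    |> Set.ofSeq

// Keep the first row explicitly, plus every row that changed
let firstKey = Seq.head df.RowKeys

let pairwiseResult =
    df |> Frame.filterRows (fun k _ -> k = firstKey || changedKeys.Contains k)
```

Filtering the original frame by key keeps the untouched columns (such as Date) without having to rebuild the frame from rows.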
The code for the third option looks like this (note that the frame rows need to be ordered, i.e. frame.Rows.IsOrdered = true, for this to work):
frame |> Frame.filterRows (fun k row ->
    let prev = frame.Rows |> Series.tryLookup k Lookup.Smaller // New in v1.0
    match prev with
    | Some prev -> prev?Something <> row?Something
    | _ -> true (* always return true for the first row *) )
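Applied to the size1 and size2 columns from the question, the same idea might look like the sketch below. The name df and the sample data are assumptions for illustration (the Date column is left out), and it assumes a Deedle version of 1.0 or later loaded from NuGet in an F# script.

```fsharp
#r "nuget: Deedle"   // assumption: run as an .fsx script with dotnet fsi
open Deedle

// Hypothetical sample data mirroring size1/size2 from the question
let keys  = [ 13; 14; 15; 16; 17; 18; 20; 22 ]
let size1 = [ 133.0; 83.0; 83.0; 83.0; 83.0; 83.0; 82.0; 81.0 ]
let size2 = [ 35.0; 35.0; 35.0; 34.0; 34.0; 34.0; 34.0; 34.0 ]

let df =
    frame [ "size1" => series (List.zip keys size1)
            "size2" => series (List.zip keys size2) ]

let lookupResult =
    df |> Frame.filterRows (fun k row ->
        // Nearest row with a key strictly smaller than k (needs ordered row keys)
        match df.Rows |> Series.tryLookup k Lookup.Smaller with
        | Some prev ->
            // Keep the row only if at least one of the two sizes changed
            prev?size1 <> row?size1 || prev?size2 <> row?size2
        | None -> true) // the first row has no predecessor, so keep it

// Row keys that survive the filter
printfn "%A" (List.ofSeq lookupResult.RowKeys)
```

With this sample data the surviving keys are 13, 14, 16, 20 and 22, matching the expected output in the question. Each row triggers its own lookup, so for very large frames the pairwise or shift-and-join options may be worth measuring against this one.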
Upvotes: 3