Reputation: 896
This question might look a little trivial, but it does come up in our process because the data is not clean. I have a data frame that looks like this:
open System
open Deedle

let tt = Series.ofObservations [ 1 => 10.0; 3 => 20.0; 5 => 30.0; 6 => 40.0 ]
let tt2 = Series.ofObservations [ 1 => Double.NaN; 3 => 5.5; 6 => Double.NaN ]
let tt3 = Series.ofObservations [ 1 => "aaa"; 3 => "bb"; 6 => "ccc" ]
let f1 = frame [ "cola" => tt; "colb" => tt2 ]
f1.AddColumn("colc", tt3)
f1.Print();;
     cola colb      colc
1 -> 10   <missing> aaa
3 -> 20   5.5       bb
5 -> 30   <missing> <missing>
6 -> 40   <missing> ccc
I need to drop every row before the first row that has a value in colb, so that the result is:
     cola colb      colc
3 -> 20   5.5       bb
5 -> 30   <missing> <missing>
6 -> 40   <missing> ccc
The only solution I can come up with uses a mutable flag, which breaks the integrity of functional programming. Maybe this "drop the missing head" filtering could be hidden in a library, but it still makes me wonder whether I am doing it the wrong way.
let flag = ref false
let filteredF1 =
    f1 |> Frame.filterRows (fun k v ->
        match !flag, v.TryGetAs<float>("colb") with
        | false, OptionalValue.Missing -> flag := false
        | false, _ -> flag := true
        | true, _ -> ()
        !flag)
This is not really a problem with Deedle; it has more to do with how this should be achieved with immutable data. Something easily achievable in Python and VBA seems to be very hard to do in F#.
Situations like this come up in statistical calculations where multiple series have different starting times. After the starting point, rows containing a missing value must be retained, because a missing value there means something.
Any advice is appreciated. cassby
Upvotes: 0
Views: 144
Reputation: 476
Here is my preferred way:
// find first index having non-null value in column b
let idx =
    f1?colb
    |> Series.observationsAll
    |> Seq.skipWhile (function | (_, None) -> true | _ -> false)
    |> Seq.head
    |> fst;;

// slice frame
f1.Rows.[idx .. ];;
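One thing to watch in the snippet above: Seq.head throws if colb has no value at all. Here is a minimal sketch of the same lookup that returns an option instead. The helper name tryFirstPresentKey and the sample list are made up for illustration; the list stands in for what `f1?colb |> Series.observationsAll` would yield on the question's data, so the block runs without Deedle.

```fsharp
// Hypothetical helper (not part of Deedle): key of the first observation
// that carries a value, or None when every observation is missing.
let tryFirstPresentKey (observations: seq<'K * 'T option>) : 'K option =
    observations
    |> Seq.skipWhile (fun (_, value) -> Option.isNone value)
    |> Seq.tryHead
    |> Option.map fst

// Stand-in for `f1?colb |> Series.observationsAll` on the sample frame.
let sampleColb = [ 1, None; 3, Some 5.5; 5, None; 6, None ]

let firstKey = tryFirstPresentKey sampleColb
let noKey = tryFirstPresentKey [ 1, (None: float option); 3, None ]
```

When the result is Some idx, the slice f1.Rows.[idx .. ] applies as before; the None case is left to the caller to decide (an empty frame, an error, and so on).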
Upvotes: 1
Reputation: 8033
If you wrap your code in a function (I modified it a little, but have not tested it at all!!):
let dropTil1stNonMissingB frame =
    // the mutable flag never escapes this function
    let flag = ref false
    let kernel _ (v: ObjectSeries<string>) =
        flag := !flag || v.TryGetAs<float>("colb").HasValue
        !flag
    Frame.filterRows kernel frame
then your code just looks purely functional:
let filteredF1 = f1 |> dropTil1stNonMissingB
As long as the use of the reference cell is restricted to a narrow scope, it should be acceptable. Immutability is not the final goal of functional programming. It is only a guiding principle for writing good code.
In fact, the Deedle developers should have provided their own version of Seq.fold for Frame. Then you could have used it with (new Frame([],[]), false) as the initial 'State. Roughly speaking, you should be able to translate any loop in C, Python, or any other imperative language into a fold (aka fold_left or foldl), though it isn't necessarily the way to go.
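To make the fold idea concrete, here is a rough sketch over a plain sequence of (key, value option) pairs rather than a Frame, since I am not aware of a fold for Frame. The function name dropUntilFirstPresent and the sample list are invented for this example. The state is a pair of the rows kept so far and a flag saying whether a value has been seen yet, which mirrors the (frame, false) initial state suggested above.

```fsharp
// Hypothetical helper (not part of Deedle): drop leading rows until the
// first one with a value, then keep every later row, missing or not.
let dropUntilFirstPresent (rows: seq<'K * 'T option>) : ('K * 'T option) list =
    // State = (rows kept so far, in reverse order; whether a value was seen)
    let step (kept, started) (key, value) =
        let started = started || Option.isSome value
        let kept = if started then (key, value) :: kept else kept
        kept, started
    rows
    |> Seq.fold step ([], false)
    |> fst
    |> List.rev

// Same shape as colb in the question: only key 3 has a value.
let colbRows = [ 1, None; 3, Some 5.5; 5, None; 6, None ]
let trimmed = dropUntilFirstPresent colbRows
```

No mutable state is involved; the flag simply travels through the fold as part of the accumulator.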
You might as well define it as an extension method of Frame.
type Frame with
    member frame.DropTil1stNonMissingB =
        ...

let filteredF1 = f1.DropTil1stNonMissingB
Upvotes: 0