Soldalma

Reputation: 4758

How do I convert missing values into strings?

I have a Deedle DataFrame of type Frame<int,string> that contains some missing values. I would like to convert the missing values into empty strings "". I tried to use the valueOr function but that did not help. Is there a way to do this?

Here is my DataFrame:

open Deedle

let s1 = Series.ofOptionalObservations [ 1 => Some("A"); 2 => None ]
let s2 = Series.ofOptionalObservations [ 1 => Some("B"); 2 => Some("C") ]
let df = Frame.ofColumns ["A", s1; "BC", s2]

Typing df;; in FSI yields some information, including:

ColumnTypes = seq [System.String; System.String];

So the values of df are of type string and not string option.

This is the function valueOr:

let valueOr (someDefault: 'a) (xo: 'a option) : 'a =
    match xo with
    | Some v -> v
    | None -> someDefault

I defined an auxiliary function emptyFoo as:

let emptyFoo = valueOr ""

The signature of emptyFoo is string option -> string. This means emptyFoo should not be acceptable to the compiler in the following command:

let df' = Frame.mapValues emptyFoo df

This is because the values of df are of type string and not string option.

Still, the compiler does not complain and the code runs. However, df' still has a missing value.

Is there a way to transform the missing value into the empty string?

Upvotes: 3

Views: 97

Answers (1)

TheQuickBrownFox

Reputation: 10624

The Deedle documentation for Frame.mapValues:

Builds a new data frame whose values are the results of applying the specified function on these values, but only for those columns which can be converted to the appropriate type for input to the mapping function

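So the mapping does nothing: the columns hold string values rather than string option values, so they cannot be converted to the input type of emptyFoo and Frame.mapValues skips them. That is also why the compiler does not complain.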

I noticed another function that seems to do exactly what you want.

let df' = Frame.fillMissingWith "" df
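
To check the result, here is a minimal F# script sketch, assuming it runs in F# Interactive with the Deedle NuGet package available (the #r line, and the names filled, before and after, are my additions for illustration). It rebuilds the frame from the question and looks up row 2 of column "A" before and after filling:

```fsharp
#r "nuget: Deedle"
open Deedle

// Same frame as in the question: column "A" has a missing value at row key 2.
let s1 = Series.ofOptionalObservations [ 1 => Some "A"; 2 => None ]
let s2 = Series.ofOptionalObservations [ 1 => Some "B"; 2 => Some "C" ]
let df = Frame.ofColumns [ "A", s1; "BC", s2 ]

// Fill missing values in the string columns with the empty string.
let filled = Frame.fillMissingWith "" df

// Series.tryGet returns an option, so a missing value shows up as None.
let before = df.GetColumn<string>("A") |> Series.tryGet 2
let after = filled.GetColumn<string>("A") |> Series.tryGet 2
printfn "before: %A, after: %A" before after
```

The original df is left unchanged; Frame.fillMissingWith returns a new frame.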

The key thing I noticed was that Deedle shows those missing values as <missing>, suggesting that it uses its own representation (as opposed to option, for example). With that knowledge I guessed that the library would provide some way of manipulating missing values, so I explored the API by typing Frame. in my IDE and browsing the list of available functions and their documentation.
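
If you would rather reuse your valueOr helper, one possible alternative is to work on a single column with Series.mapAll, which, as far as I can tell from the Deedle API, passes each value to the function as an option (None for a missing value) and expects an option back. This is only a sketch for one series under that assumption, not a replacement for Frame.fillMissingWith; the name s1' is mine:

```fsharp
#r "nuget: Deedle"
open Deedle

// The helper from the question.
let valueOr (someDefault: 'a) (xo: 'a option) : 'a =
    match xo with
    | Some v -> v
    | None -> someDefault

let s1 : Series<int, string> =
    Series.ofOptionalObservations [ 1 => Some "A"; 2 => None ]

// mapAll also visits missing values, so valueOr gets to see the None case.
let s1' = s1 |> Series.mapAll (fun _ v -> Some (valueOr "" v))

printfn "%A" (Series.tryGet 2 s1')
```

For a whole frame, Frame.fillMissingWith is still the shorter route.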

Upvotes: 4
