ca9163d9

Reputation: 29179

Get the column type information from CsvProvider?

I have the following code to get the type information of a CSV file. How can I get the type information of the columns? I need to save it to a database table.

open FSharp.Data

type MyFile = CsvProvider<"""C:\temp\sample.csv""">

[<EntryPoint>]
let main argv = 
    let myFile = MyFile.Load("""C:\temp\sample.csv""")

    printfn "%A" ((myFile.Rows |> Seq.head).GetType())
    // Write the type information of myFile columns to a table

    for row in myFile.Rows do
        printfn "%A" row
    0 

The expression (myFile.Rows |> Seq.head).GetType() returns nested tuples of basic .NET types, and the header names are missing.

System.Tuple`8[System.Int32,System.Int32,System.String,System.Int32,System.Int32
,System.String,System.String,System.Tuple`8[System.Int32,System.String,System.De
cimal,System.Decimal,System.Decimal,System.Decimal,System.Int32,System.Tuple`8[S
ystem.Decimal,System.Decimal,System.Decimal,System.Nullable`1[System.Int32],Syst
em.String,System.Boolean,System.Int32,System.Tuple`8[System.Decimal,System.Int32
,System.Int32,System.Decimal,System.Int32,System.Nullable`1[System.Int32],System
.Int32,System.Tuple`8[System.Decimal,System.Nullable`1[System.Int32],System.Null
able`1[System.Int32],System.Nullable`1[System.Int32],System.Decimal,System.Decim
al,System.String,System.Tuple`8[System.String,System.String,System.String,System
.String,System.String,System.String,System.String,System.Tuple`8[System.String,S
ystem.String,System.String,System.String,System.String,System.String,System.Null
able`1[System.Int32],System.Tuple`8[System.String,System.String,System.Nullable`
1[System.Int32],System.String,System.String,System.String,System.String,System.T
uple`8[System.String,System.String,System.String,System.String,System.String,Sys
tem.String,System.String,System.Tuple`1[System.String]]]]]]]]]]

Expected output:

ColumnA int
ColumnB datetime
ColumnC varchar
....

Upvotes: 0

Views: 449

Answers (1)

Sven Grosen

Reputation: 5636

I am sure someone can provide a more idiomatic way to organize this, but this should at least work. Note that I am deliberately not doing any exception handling, and I access the value of the string[] option (Headers) directly. The parameters are on separate lines for formatting purposes only.

let rec iterateTupleMemberTypes (tupleArgTypes: System.Type[]) 
    (columnNames: string[]) 
    (startingIndex : int) =
    let mutable index = startingIndex
    for t in tupleArgTypes do
        // Recurse only into nested tuples. Nullable<'T> is generic too,
        // but it is a real column type and must be reported as-is.
        if Microsoft.FSharp.Reflection.FSharpType.IsTuple t then
            iterateTupleMemberTypes (t.GetGenericArguments()) columnNames index
        else
            printfn "Name: %s Type: %A" (columnNames.[index]) t
            index <- index + 1

And call it like this:

// myFile is the instance returned by MyFile.Load in the question
let firstRow = myFile.Rows |> Seq.head
let tupleType = firstRow.GetType()
let tupleArgTypes = tupleType.GetGenericArguments()
iterateTupleMemberTypes tupleArgTypes myFile.Headers.Value 0

The recursion in iterateTupleMemberTypes is necessary because System.Tuple takes at most 8 type arguments: once a tuple has 8 or more members, the eighth argument is itself a tuple holding all the remaining members. In my testing, this happened once I hit 8 members.
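
The nesting can be seen without any CSV file. The following is a small sketch, assuming only FSharp.Core; the names nineType, topLevel and flattened are illustrative and not part of the code above. It also shows that FSharpType.GetTupleElements flattens the nested tuples, which may be an alternative to recursing by hand:

```fsharp
open Microsoft.FSharp.Reflection

// A 9-element tuple type: seven direct members plus a nested tuple for the rest
let nineType = typeof<int * string * int * int * int * int * int * int * decimal>

// Inspect the raw generic arguments of the compiled System.Tuple type
let topLevel = nineType.GetGenericArguments()
printfn "Top-level arguments: %d, last one: %s" topLevel.Length topLevel.[7].Name

// GetTupleElements walks into the nested tuple and returns every element type
let flattened = FSharpType.GetTupleElements nineType
printfn "Flattened elements: %d" flattened.Length
```

With the flattened array, pairing each type with its header could be as simple as Array.zip, provided both arrays have the same length.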

EDIT

The OP asked in the comments how to modify iterateTupleMemberTypes to build up a collection of type/name pairs. Here it is (I decided to return them as tuples):

let iterateTupleMemberTypes (tupleArgTypes: System.Type[]) (columnNames: string[]) =
    let rec iterateRec (argTypes: System.Type list) values index =
        match argTypes with
        | [] -> List.rev values
        // Recurse only into nested tuples (Nullable<'T> is generic too, but is a column type),
        // and keep the tail so nothing after the nested type is dropped
        | head :: tail when Microsoft.FSharp.Reflection.FSharpType.IsTuple head -> 
            iterateRec (List.ofArray (head.GetGenericArguments()) @ tail) values index
        | head :: tail -> 
            iterateRec tail ((head, columnNames.[index])::values) (index + 1)
    iterateRec (List.ofArray tupleArgTypes) List.empty 0

Call it like this:

let tupleType = firstRow.GetType()
let tupleArgTypes = tupleType.GetGenericArguments()
let schemaStuff = iterateTupleMemberTypes tupleArgTypes myFile.Headers.Value

As a bonus, here's how you can iterate through the resulting tuples:

let rec printSchemaMembers (schema:(System.Type*string) list) =
    match schema with
    | (argType, name)::tail ->
        printfn "Type: %A, Name: %s" argType name
        printSchemaMembers tail
    | [] -> () // return unit; `ignore` here would return a function instead
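
To get closer to the output format asked for in the question, each .NET type can be translated to a SQL type name. This is only a sketch: sqlTypeName and sampleSchema are hypothetical names, and the SQL type names chosen here are assumptions that would need adjusting for the target database.

```fsharp
open System

// Hypothetical mapping from .NET types to SQL type names (an assumption, not a standard)
let sqlTypeName (t: Type) =
    // Nullable<'T> columns map to the same SQL type as 'T
    let underlying =
        match Nullable.GetUnderlyingType t with
        | null -> t
        | u -> u
    if underlying = typeof<int> then "int"
    elif underlying = typeof<decimal> then "decimal"
    elif underlying = typeof<DateTime> then "datetime"
    elif underlying = typeof<bool> then "bit"
    else "varchar" // fallback for strings and anything unrecognized

// Made-up sample in the (type, name) shape returned by iterateTupleMemberTypes
let sampleSchema =
    [ (typeof<int>, "ColumnA")
      (typeof<DateTime>, "ColumnB")
      (typeof<string>, "ColumnC")
      (typeof<Nullable<int>>, "ColumnD") ]

for (t, name) in sampleSchema do
    printfn "%s %s" name (sqlTypeName t)
```

In practice the list passed in would be schemaStuff rather than the made-up sample.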

Upvotes: 1
