tesla1060

Reputation: 2765

F# read zipped csv file

Is it possible to use F# Deedle to read a zipped CSV file directly, like the read_csv function in pandas? If not, is it possible to do this with the CSV type provider?

Upvotes: 3

Views: 308

Answers (2)

s952163

Reputation: 6324

Why do you need to read the zipped CSV directly? You can always open the archive with System.IO.Compression and then feed the entry to Deedle, the CsvProvider, or even FileHelpers:

open System.IO
open System.IO.Compression

let zipfile = @"C:\tmp\zipFile1.zip"

// Open the zip file as a read-only ZipArchive
let unzip (zipfile: string) =
    let zipf = new FileStream(zipfile, FileMode.Open, FileAccess.Read)
    new ZipArchive(zipf)

let unzipFile = unzip zipfile
// Read the whole CSV entry into a string
let stream = new StreamReader(unzipFile.GetEntry("zipFile1.csv").Open())
let txt = stream.ReadToEnd()

If your input can take a stream (like the above libraries), then this utility function will do it (using OpenRead directly on the zipfile):

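// (entry: string, zip: string) -> StreamReader
// Note: the ZipArchive is not disposed here; it stays open while the reader is used.
let getFromZip (entry, zip) =
    ZipFile.OpenRead(zip)
    |> (fun x -> x.GetEntry(entry))
    |> (fun x -> new StreamReader(x.Open()))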

You might also need to reference the System.IO.Compression.FileSystem assembly, which provides ZipFile. There is no namespace of that name to open; open System.IO.Compression is enough.
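If the consumer wants a plain Stream rather than a StreamReader, one possible variation is to copy the entry into memory first, so the archive can be closed before parsing. This is only a sketch: readEntryToMemory is a hypothetical helper name, not part of any library, and it assumes the entry is small enough to hold in memory.

```fsharp
open System.IO
open System.IO.Compression

// Hypothetical helper (not from the answer above): copies one entry of a
// zip archive into a MemoryStream so the archive can be disposed right away.
let readEntryToMemory (zipPath: string) (entryName: string) : MemoryStream =
    use archive = ZipFile.OpenRead(zipPath)
    match archive.GetEntry(entryName) with
    | null -> failwithf "Entry '%s' not found in '%s'" entryName zipPath
    | entry ->
        use source = entry.Open()
        let buffer = new MemoryStream()
        source.CopyTo(buffer)
        buffer.Position <- 0L // rewind so the consumer reads from the start
        buffer
```

The returned stream is seekable and independent of the archive, so it can be handed to anything that accepts a Stream, such as the Frame.ReadCsv call used in the Deedle answer in this thread.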

Upvotes: 1

marklam

Reputation: 5358

If you use the ICSharpCode.SharpZipLib NuGet package, you can read the CSV from the zip with Deedle like this:

open ICSharpCode.SharpZipLib.Zip
open System.IO
open Deedle

[<EntryPoint>]
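let main argv =
    use fs = new FileStream(@"mycsv.zip", FileMode.Open, FileAccess.Read)
    use zip = new ZipFile(fs)
    // Entry 0 is the first file in the archive
    use csv = zip.GetInputStream(0L)
    let frame = Frame.ReadCsv(csv)
    printfn "Read %d rows" frame.RowCount
    0 // an entry point must return an int exit code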

Upvotes: 3
