user6283344

CsvProvider throws OutOfMemoryException

FAOCropsLivestock.csv contains more than 14 million rows. In my .fs file I have declared

type FAO = CsvProvider<"c:\FAOCropsLivestock.csv">

and tried to work with the following code

FAO.GetSample().Rows.Where(fun x -> x.Country = country) |> ....
FAO.GetSample().Filter(fun x -> x.Country = country) |> ....

In both cases, exception was thrown.

I have also tried the following code after loading the CSV file into MS SQL Server

type Schema = SqlDataConnection<conStr>
let db = Schema.GetDataContext()
db.FAOCropsLivestock.Where(fun x-> x.Country = country) |> ....

This works. It also works if I issue the query over an OleDb connection, but that is slow.

How can I get a sequence out of it using CsvProvider?

Upvotes: 2

Views: 107

Answers (1)

TheInnerLight

Reputation: 12184

If you refer to the bottom of the CSV Type Provider documentation, you will see a section on handling large datasets. As explained there, you can set CacheRows = false, which stops the provider from caching every row in memory and makes large datasets workable.

type FAO = CsvProvider<"c:\FAOCropsLivestock.csv", CacheRows = false>

You can then apply standard sequence operations to the rows of the CSV without loading the entire file into memory, e.g.

FAO.GetSample().Rows |> Seq.filter (fun x -> x.Country = country) |> ....

You should, however, take care to enumerate the contents only once, because with caching disabled the rows are streamed from the file rather than kept in memory.
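
One way to respect the enumerate-once constraint is to stream the file a single time, filter while streaming, and materialize only the matching rows. The following is a minimal sketch, not a definitive implementation: it assumes the FSharp.Data package is referenced, that the Country column is inferred as a string, and that the filtered subset fits in memory. The function name rowsFor and the country value "Brazil" are hypothetical and used only for illustration.

```fsharp
open FSharp.Data

// CacheRows = false: rows are streamed from the file instead of being cached.
type FAO = CsvProvider< @"c:\FAOCropsLivestock.csv", CacheRows = false>

// rowsFor is a hypothetical helper. Each call opens the file afresh via
// GetSample(), makes exactly one pass over Rows, and keeps only the matches.
let rowsFor (country: string) =
    FAO.GetSample().Rows
    |> Seq.filter (fun row -> row.Country = country)
    |> Seq.toArray   // materialize the (much smaller) filtered result

// "Brazil" is an assumed example value, not taken from the data set.
let rows = rowsFor "Brazil"

// The array can now be traversed as often as needed without touching the file.
printfn "%d matching rows" rows.Length
```

Because the result is an array, later passes (counting, grouping, sorting) run against memory and never re-enumerate the uncached Rows sequence.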

Upvotes: 5
