Reputation: 11927
The following .fsx file is supposed to download binary tablebase files, which are posted as links in an HTML page on the internet, and save them to disk using FSharp.Data.dll.
What happens is that the whole thing stalls after a while, well before it is done, without even throwing an exception or the like.
I am pretty sure I am mishandling CopyToAsync() in my async workflow. As this is supposed to run while I go for a nap, it would be nice if someone could tell me how it is supposed to be done correctly. (In more general terms: how do you handle a System.Threading.Tasks.Task in an async workflow?)
#r @"E:\R\playground\DataTypeProviderStuff\packages\FSharp.Data.2.2.3\lib\net40\FSharp.Data.dll"
open FSharp.Data
open Microsoft.FSharp.Control.CommonExtensions
let document = HtmlDocument.Load("http://www.olympuschess.com/egtb/gaviota/")
let links =
    document.Descendants ["a"]
    |> Seq.choose (fun x -> x.TryGetAttribute("href") |> Option.map (fun a -> a.Value()))
    |> Seq.filter (fun v -> v.EndsWith(".cp4"))
    |> List.ofSeq
let targetFolder = @"E:\temp\tablebases\"
let downloadUrls =
    links |> List.map (fun name -> "http://www.olympuschess.com/egtb/gaviota/" + name, targetFolder + name )
let awaitTask = Async.AwaitIAsyncResult >> Async.Ignore
let fetchAndSave (s,t) =
    async {
        printfn "Starting with %s..." s
        let! result = Http.AsyncRequestStream(s)
        use fileStream = new System.IO.FileStream(t,System.IO.FileMode.Create)
        do! awaitTask (result.ResponseStream.CopyToAsync(fileStream))
        printfn "Done with %s." s
    }
let makeBatches n jobs =
    let rec collect i jl acc =
        match i,jl with
        | 0, _ -> acc,jl
        | _, [] -> acc,jl
        | _, x::xs -> collect (i-1) (xs) (acc @ [x])
    let rec loop remaining acc =
        match remaining with
        | [] -> acc
        | x::xs ->
            let r,rest = collect n remaining []
            loop rest (acc @ [r])
    loop jobs []
let download () =
    downloadUrls
    |> List.map fetchAndSave
    |> makeBatches 2
    |> List.iter (fun l -> l |> Async.Parallel |> Async.RunSynchronously |> ignore )
    |> ignore
download()
Note: I updated the code so that it creates batches of 2 downloads at a time, and only the first batch works. I also added the awaitTask from the first answer, as this seems to be the right way to do it.
News: What is also funny is that if I interrupt the stalled script and then #load it again into the same instance of fsi.exe, it stalls right away. I am starting to think it is a bug in the library I use, or something like that.
Thanks in advance!
Upvotes: 3
Views: 293
Reputation: 2291
Here, fetchAndSave has been modified to handle the Task returned from CopyToAsync asynchronously; in your version you are waiting on the Task synchronously. Your script will appear to lock up because you are using Async.RunSynchronously to run the whole workflow. However, the files do download as expected in the background.
let awaitTask = Async.AwaitIAsyncResult >> Async.Ignore
let fetchAndSave (s,t) = async {
    let! result = Http.AsyncRequestStream(s)
    use fileStream = new System.IO.FileStream(t,System.IO.FileMode.Create)
    do! awaitTask (result.ResponseStream.CopyToAsync(fileStream))
}
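For the more general question of awaiting a Task inside an async workflow, here is a minimal sketch that does not touch the network. It assumes FSharp.Core 4.0 or later, where Async.AwaitTask also accepts a non-generic Task. The copyAsync helper and the in-memory streams are hypothetical stand-ins for the download and are not part of the original script.

```fsharp
open System.IO

// Hypothetical helper: copy one stream into another from inside an async workflow.
// Assumption: FSharp.Core 4.0 or later, where Async.AwaitTask accepts a plain Task.
let copyAsync (source: Stream) (target: Stream) : Async<unit> =
    async {
        // CopyToAsync returns a non-generic Task; AwaitTask turns it into Async<unit>.
        do! source.CopyToAsync(target) |> Async.AwaitTask
    }

// Stand-in for a download: copy a small in-memory buffer (made-up data).
let copiedLength =
    use source = new MemoryStream([| 1uy; 2uy; 3uy |])
    use target = new MemoryStream()
    copyAsync source target |> Async.RunSynchronously
    target.Length
```

On an older FSharp.Core without that overload, the awaitTask helper shown above plays the same role. As far as I know, Async.AwaitTask also re-raises an exception from a faulted Task inside the workflow, which may make a silent stall easier to diagnose.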
Of course, you also need to call
do download()
on the last line of your script to kick things into motion.
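As an aside on the batching in the question, the hand-written makeBatches could probably be replaced by List.chunkBySize. The following is only a sketch under that assumption (FSharp.Core 4.0 or later). The runInBatches helper is a hypothetical name, and the jobs here are dummy computations rather than real downloads.

```fsharp
// Hypothetical helper: run jobs in batches of `size`, one batch after another.
// Assumption: FSharp.Core 4.0 or later, which provides List.chunkBySize.
let runInBatches (size: int) (jobs: Async<'T> list) : 'T list =
    jobs
    |> List.chunkBySize size
    |> List.collect (fun batch ->
        // Each batch runs in parallel; the next batch starts only after this one finishes.
        batch |> Async.Parallel |> Async.RunSynchronously |> List.ofArray)

// Stand-in jobs that just compute a number (no network involved).
let results =
    [ 1 .. 5 ]
    |> List.map (fun i -> async { return i * 10 })
    |> runInBatches 2
```

With the script's own values, the call would look like downloadUrls |> List.map fetchAndSave |> runInBatches 2 |> ignore.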
Upvotes: 2