Reputation: 9426
I'm learning F# and I've started to play around with both sequences and match expressions.
I'm writing a web scraper that looks through HTML similar to the following and takes the last URL in a parent <span> with the paging class.
<html>
  <body>
    <span class="paging">
      <a href="http://google.com">Link to Google</a>
      <a href="http://TheLinkIWant.com">The Link I want</a>
    </span>
  </body>
</html>
My attempt to get the last URL is as follows:
type AnHtmlPage = FSharp.Data.HtmlProvider<"http://somesite.com">

let findMaxPageNumber (page:AnHtmlPage)=
    page.Html.Descendants()
    |> Seq.filter(fun n -> n.HasClass("paging"))
    |> Seq.collect(fun n -> n.Descendants() |> Seq.filter(fun m -> m.HasName("a")))
    |> Seq.last
    |> fun n -> n.AttributeValue("href")
However, I'm running into issues when the class I'm searching for is absent from the page. In particular, Seq.last throws an ArgumentException with the message: The input sequence was empty.
My first thought was to build another function that matched empty sequences and returned an empty string when the paging class wasn't found on a page.
let findUrlOrReturnEmptyString (span:seq<HtmlNode>) =
    match span with
    | Seq.empty -> String.Empty // <----- This is invalid
    | span ->
        span
        |> Seq.collect(fun (n:HtmlNode) -> n.Descendants() |> Seq.filter(fun m -> m.HasName("a")))
        |> Seq.last
        |> fun n -> n.AttributeValue("href")

let findMaxPageNumber (page:AnHtmlPage)=
    page.Html.Descendants()
    |> Seq.filter(fun n -> n.HasClass("paging"))
    |> findUrlOrReturnEmptyString
My issue now is that Seq.empty is not a literal and cannot be used in a pattern. Most examples of pattern matching use the empty list [] in their patterns, so I'm wondering: how can I use a similar approach to match empty sequences?
Upvotes: 10
Views: 6882
Reputation: 6629
You can use a when guard to further qualify the case:
match span with
| sequence when Seq.isEmpty sequence -> String.Empty
| span ->
    span
    |> Seq.collect (fun (n: HtmlNode) ->
        n.Descendants()
        |> Seq.filter (fun m -> m.HasName("a")))
    |> Seq.last
    |> fun n -> n.AttributeValue("href")
As ildjarn points out, though, an if...then...else may be the more readable alternative in this case.
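For comparison, here is a minimal sketch of the if...then...else form. It works on any sequence rather than on HtmlNode values so that it runs without FSharp.Data; the name lastOrDefault and its fallback parameter are made up for illustration and are not part of the question's code.

```fsharp
// Hypothetical helper (not from the question): returns the last element,
// or the supplied fallback when the sequence is empty.
let lastOrDefault (fallback: 'T) (items: seq<'T>) : 'T =
    if Seq.isEmpty items then
        fallback
    else
        // Safe here: the sequence has at least one element.
        Seq.last items
```

Note that Seq.isEmpty and Seq.last each start their own enumeration, so a lazy sequence with expensive or side-effecting elements is evaluated twice. The same applies to the when guard version.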
Upvotes: 13
Reputation: 3932
Use a guard clause:
match myseq with
| s when Seq.isEmpty s -> "empty"
| _ -> "not empty"
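If the goal is specifically "last element or a default", a different technique from the guard above may also fit: Seq.tryLast returns an option, so the match can be written against Some and None instead of against the sequence itself. This is a sketch under that assumption; lastOrEmpty is a hypothetical name, and it takes plain strings so that it runs without FSharp.Data.

```fsharp
// Hypothetical helper (not from the thread): Seq.tryLast yields None for an
// empty sequence, so no guard or emptiness check is needed.
let lastOrEmpty (urls: seq<string>) : string =
    match Seq.tryLast urls with
    | Some url -> url
    | None -> ""
```

This enumerates the sequence only once, which can matter for lazily produced sequences.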
Upvotes: 5
Reputation: 10624
Building on the answer from @rmunn, you can make a more general sequence equality active pattern.
let (|Seq|_|) test input =
    if Seq.compareWith Operators.compare input test = 0
    then Some ()
    else None

match [] with
| Seq [] -> "empty"
| _ -> "not empty"
Upvotes: 2
Reputation: 36678
The suggestion that ildjarn gave in the comments is a good one: if you feel that using match would create more readable code, then make an active pattern to check for empty seqs:
let (|EmptySeq|_|) a = if Seq.isEmpty a then Some () else None
let s0 = Seq.empty<int>
match s0 with
| EmptySeq -> "empty"
| _ -> "not empty"
Run that in F# interactive, and the result will be "empty".
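Applied to the shape of the original problem, the active pattern could be used roughly as follows. This is only a sketch: lastUrlOrEmpty is a hypothetical name, and it takes a sequence of URL strings rather than HtmlNode values so that it runs without FSharp.Data.

```fsharp
// Same active pattern as above, with an explicit type annotation.
let (|EmptySeq|_|) (a: seq<'T>) = if Seq.isEmpty a then Some () else None

// Hypothetical helper: last URL in the sequence, or "" when there is none.
let lastUrlOrEmpty (urls: seq<string>) : string =
    match urls with
    | EmptySeq -> ""
    | nonEmpty -> Seq.last nonEmpty
```

In the scraper, the sequence of href values would be passed in place of urls.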
Upvotes: 16