Reputation: 41
While trying to apply the regular expression below to each element in a list (test), I get an error. This is what I am trying to do in ghci
import Text.Regex.TDFA
let test = ["<lobbying_firm>The CrisCom Company</lobbying_firm>","<registration_year>2013</registration_year>"]
let regExp = "\\>(.)*\\<"
let result = map (=~ regExp :: String) test
How can I do this? Any ideas?
Upvotes: 3
Views: 94
Reputation: 476493
You are quite close. The only problem is that your type signature will generate some trouble here. The type of (=~ rexExp)
you want is probably String -> String
. Indeed the type of the mapping function is a function that takes a String
as parameter, and returns a String
here. Not a String
itself.
We can thus create a map
with:
result = map ((=~ regExp) :: String -> String) test
This produces:
Prelude Text.Regex.TDFA> map ((=~ regExp) :: String -> String) test
[">The CrisCom Company</",">2013</"]
That being said, I strongly advice not to parse HTML, XML, JSON, etc. with a regex. Indeed regexes can not parse HTML and other recursive languages. This is the consequence of the Pumping lemma for regular languages [wiki]. You can never fully parse HTML. You might indeed parse some sublanguages, etc. But even then the regex will easily get (very) complicated. You therefore better use a library like tagsoup
[hackage], or a scraper library like scalpel
[hackage].
Upvotes: 2