Reputation: 11321
I'm trying to make a simple Haskell program that takes any line that looks like someFilenameHere0035.xml and returns 0035. My sample input file, input.txt, would look like this:
someFilenameHere0035.xml
anotherFilenameHere4465.xml
And running cat input.txt | runhaskell getID.hs should return:
0035
4465
I'm having so much difficulty figuring this out. Here's what I have so far:
import Text.Regex.PCRE
getID :: String -> [String]
getID str = str =~ "([0-9]+)\\.xml" :: [String]
main :: IO ()
main = interact $ unlines . getID
But I get an error message I don't understand at all:
• No instance for (RegexContext Regex String [String])
arising from a use of ‘=~’
• In the expression: str =~ "([0-9]+)\\.xml" :: [String]
In an equation for ‘getID’:
getID str = str =~ "([0-9]+)\\.xml" :: [String] (haskell-stack-ghc)
I feel like I'm really close, but I don't know where to go from here. What am I doing wrong?
Reputation: 756
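First off, you only want the number part, so we can drop the \\.xml from the pattern. When the result type is String, =~ returns the whole matched text rather than just the capture group, so keeping \\.xml would give 0035.xml. Note that the shortened pattern matches the first run of digits anywhere in the line, not only the digits right before .xml.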
The RegexContext instances (defined in regex-base, which regex-pcre builds on) include RegexContext Regex String String but not RegexContext Regex String [String], hence the error.
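To make the point about result types concrete, here is a small sketch of a few result types that =~ does accept for a String input. The names sample, pat, wholeMatch, allMatches and parts are made up for illustration; the pattern is the one from the question.

```haskell
import Text.Regex.PCRE ((=~))

-- Illustrative sample line and pattern, taken from the question.
sample, pat :: String
sample = "someFilenameHere0035.xml"
pat    = "([0-9]+)\\.xml"

-- String: the text of the first whole match (not the capture group).
wholeMatch :: String
wholeMatch = sample =~ pat

-- [[String]]: one list per match, whole match first, then each group.
allMatches :: [[String]]
allMatches = sample =~ pat

-- (before, match, after, groups) for the first match.
parts :: (String, String, String, [String])
parts = sample =~ pat

main :: IO ()
main = do
  putStrLn wholeMatch   -- 0035.xml
  print allMatches      -- [["0035.xml","0035"]]
  print parts           -- ("someFilenameHere","0035.xml","",["0035"])
```

A plain [String] is not among these, which is why the original annotation is rejected.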
So if we change the type signature to String -> String (and drop the inline :: [String] annotation), that error is taken care of.
unlines expects [String], so to test what we had at this point I wrote a quick function that wraps its argument in a list (there's probably a nicer way to do that, but that's not the point of the question):
toList :: a -> [a]
toList a = [a]
Running your command with main = interact $ unlines . toList . getID outputs 0035, so we're almost there.
getID is passed the whole file contents as a single String, with the filenames conveniently separated by the \n character. So we can use splitOn "\n" from the Data.List.Split module (in the split package) to get our list of .xml filenames. Then we simply need to map getID over that list (toList is no longer needed).
This gives us:
import Text.Regex.PCRE
import Data.List.Split
getID :: String -> String
getID str = str =~ "([0-9]+)"
main :: IO ()
main = interact $ unlines . map getID . splitOn "\n"
This gives me the desired output when I run your command.
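As a possible variation, not the only way to do it: the sketch below keeps the \\.xml in the pattern and pulls out just the capture group through the [[String]] result type, so digits elsewhere in a filename are ignored. It also uses lines from the Prelude instead of splitOn "\n", which avoids the empty trailing element (and the resulting blank output line) that splitOn produces when the input ends with a newline. Returning Maybe String and skipping non-matching lines with mapMaybe is my own assumption about the desired behaviour.

```haskell
import Data.Maybe (mapMaybe)
import Text.Regex.PCRE ((=~))

-- Return the digits immediately before ".xml" in the first match,
-- or Nothing if the line contains no such match.
getID :: String -> Maybe String
getID str =
  case str =~ "([0-9]+)\\.xml" :: [[String]] of
    ((_ : digits : _) : _) -> Just digits   -- first match, first group
    _                      -> Nothing

main :: IO ()
main = interact $ unlines . mapMaybe getID . lines
```

With the sample input.txt from the question, this should print 0035 and 4465 on separate lines, with no trailing blank line.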
Hopefully this helps :)