letsjak

Reputation: 359

Extract Integers from file

If my file (csv) looks like this:

John,12323,New York, 2233

I read the file with:

contents <- readFile "data.csv"

My result is a String which I split with splitOn:

["John","12323","New York","2233"]

How can I filter only the numbers from this list?

filter (=~ "regex") resultList

I already tried filter with a regex, as above, but it does not work.

This is what I want to achieve:

[12323,2233]

Upvotes: 0

Views: 156

Answers (3)

effectfully

Reputation: 12715

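import Data.Char (isDigit)

-- A field counts as an integer only if it is non-empty and all digits;
-- without the null check, read "" would fail at runtime.
isInteger :: String -> Bool
isInteger s = not (null s) && all isDigit s

onlyIntegers :: [String] -> [Integer]
onlyIntegers = map read . filter isInteger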

Upvotes: 4

bheklilr

Reputation: 54058

You could use regex, but that's going to be error-prone and slow. You'd essentially have to parse each number twice: once to match it and once more to convert it. Instead, you can use a relatively simple solution combining built-in functions:

import Text.Read (readMaybe)
import Data.Maybe (mapMaybe)

-- readMaybe returns Nothing for fields that are not valid Ints,
-- and mapMaybe drops those fields.
extractInts :: [String] -> [Int]
extractInts = mapMaybe readMaybe
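To show how this fits the line from the question, here is a minimal sketch using only base. The names splitOnComma and intsInLine are hypothetical helpers introduced for illustration; splitOnComma is a simplified stand-in for splitOn "," and does not handle quoted fields.

```haskell
import Data.Maybe (mapMaybe)
import Text.Read (readMaybe)

-- Hypothetical stand-in for splitOn ",": splits on every comma,
-- with no support for quoted fields.
splitOnComma :: String -> [String]
splitOnComma s = case break (== ',') s of
    (field, [])       -> [field]
    (field, _ : rest) -> field : splitOnComma rest

-- Keep only the fields that parse completely as an Int.
extractInts :: [String] -> [Int]
extractInts = mapMaybe readMaybe

-- Hypothetical helper: every integer field in one CSV line.
intsInLine :: String -> [Int]
intsInLine = extractInts . splitOnComma
```

With this, intsInLine "John,12323,New York, 2233" evaluates to [12323,2233]. The leading space in " 2233" is not a problem, because readMaybe skips surrounding whitespace.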

A better solution would be to use a CSV parsing library like Cassava, in which you could write a data structure like

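import Control.Monad (mzero)
import Data.Csv (FromRecord (..), (.!))

data MyRecord = MyRecord
    { name :: String
    , zipCode :: Int
    , city :: String
    , anotherField :: Int
    } deriving (Eq, Show)

-- Each row must have exactly four fields; any other row fails to parse.
instance FromRecord MyRecord where
    parseRecord v
        | length v == 4
            =   MyRecord
            <$> v .! 0
            <*> v .! 1
            <*> v .! 2
            <*> v .! 3
        | otherwise = mzero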

Then you can use the decode functions in Cassava to parse the whole file for you, more efficiently than splitting each line with splitOn.
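As a rough sketch of what that decoding step could look like, the following uses Cassava's decode with NoHeader. To keep it self-contained it decodes each row into a plain tuple rather than MyRecord; the Row alias and the numbersFromCsv name are assumptions made for this example, and it requires the cassava and vector packages.

```haskell
import qualified Data.ByteString.Lazy as BL
import Data.Csv (HasHeader (NoHeader), decode)
import qualified Data.Vector as V

-- Assumed row layout: name, first number, city, second number.
type Row = (String, Int, String, Int)

-- Decode the whole input and keep only the two numeric columns of each row.
-- A Left value carries Cassava's error message if any row fails to parse.
numbersFromCsv :: BL.ByteString -> Either String [Int]
numbersFromCsv input = do
    rows <- decode NoHeader input :: Either String (V.Vector Row)
    pure (concatMap (\(_, a, _, b) -> [a, b]) (V.toList rows))
```

It could be applied to the file with numbersFromCsv <$> BL.readFile "data.csv", which yields either an error message or the list of integers.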

Upvotes: 3

bitemyapp

Reputation: 1647

Use a CSV parsing library like Cassava: http://hackage.haskell.org/package/cassava

Among other things it has decoding for things like Integers built in with error handling.

I have a 7,000 word post here that is all about CSV parsing if you'd like examples.

Upvotes: 3
