Reputation: 196
I'm relatively new to haskell, but in my searching I couldn't find an easy way to conditionally fold a list. i.e. When an element satisfies a condition (like in filter
) to fold that element by a function (like foldr
and foldl
).
My workaround was to write the following helper function, then apply map
to change the resulting list of pairs as my situation required.
-- This function returns tuples containing the elements which
-- satisfy `cond` folded right, adding 1 to the second value
-- in each pair. (`snd` pair starts at 0)
-- Condition takes a single value (similar to `filter`)
-- NOTE: list cannot end with token
foldrOn cond list =
if (length list) > 0 then
if cond (head list) then
do
let tmp = foldrOn cond (tail list)
(fst (head tmp), snd (head tmp) + 1) : (tail tmp)
-- fold token into char after it
else
(head list, 0) : (foldrOn cond (tail list))
-- don't fold token
else
[] -- base case len list = 0
foldlOn cond list = ...
For example, the use-case would be something along the lines of wanting to remove the zeros in the following lists but remember how many were removed between each value.
-- the second value in each resultant pair represents the number of
-- zeroes preceding the corresponding first value in the original list.
foldrOn (== 0) [1,0,0,0,0,0,1,0,0,0,1] -- [(1,0),(1,5),(1,3)]
foldrOn (== 0) [1,0,0,12,0,13] -- [(1,0),(12,2),(13,1)]
Is there a better way to accomplish this? Additionally, can this be done more optimally?
Upvotes: 3
Views: 999
Reputation: 48591
Let's start by using pattern matching to make your own implementation more idiomatic, more obviously correct, and also (much) faster. We can also use guards in an idiomatic fashion rather than if
/then
/else
; this is rather less important. There's also no reason to use do
here, so we won't.
foldrOn _cond [] = []
foldrOn cond (hd : tl)
| cond hd
= case foldrOn cond tl of
(x, y) : tl' -> (x, y + 1) : tl'
-- fold token into char after it
[] -> error "String ended on token."
| otherwise
= (hd, 0) : foldrOn cond tl
-- don't fold token
This is ... okay. But as Will Ness suggests, we don't actually gain anything by consing an "incomplete" element onto the result list. We can instead count up the cond
-satisfying tokens until we reach the end of the block, and then produce a complete element. I think this makes the code a little easier to understand, and it should also run a little bit faster.
foldrOn cond = go 0
where
go count (hd : tl)
| cond hd
= go (count + 1) tl -- Don't produce anything; just bump the count
| otherwise
= (hd, count) : go 0 tl -- Produce the element and the count; reset the count to 0
go count []
| count == 0
= []
| otherwise
= error "List ended on a token."
To actually run faster, you might need to tell the compiler explicitly that you really want to calculate the counts. You probably don't need to understand this part just yet, but it looks like this:
-- At the top of the file, add this line:
{-# LANGUAGE BangPatterns #-}
foldrOn cond = go 0
where
go !count (hd : tl)
| cond hd
= go (count + 1) tl -- Don't produce anything; just bump the count
| otherwise
= (hd, count) : go 0 tl -- Produce the element and the count; reset the count to 0
go count []
| count == 0
= []
| otherwise
= error "List ended on a token."
This can be written as a fold in the manner Will Ness demonstrates.
Note: while it's possible to avoid the BangPatterns
language extension, doing so is a bit annoying.
Upvotes: 2
Reputation: 71075
First of all,
foldrOn :: Num t => (a -> Bool) -> [a] -> [(a, t)]
-- foldrOn (== 0) [1,0,0,0,0,0,1,0,0,0,1] -- [(1,0),(1,5),(1,3)]
foldrOn p xs = foldr g [] xs
where
g x [] = [(x,0)]
g x ((y,n):r)
| p x = ((y,n+1):r)
g x r = ((x,0):r)
This is the simplest, though it is recursive, i.e. will force the whole list to the end before starting returning its result.
To make it maximally lazy we'd have to use a lazy left fold. The skipping over the p
-satisfying elements is still a recursive step, but at least the process will pause between each such span.
Lazy left fold is usually implemented as a foldr
with additional argument being passed left to right along the list:
foldlOn :: Num t => (a -> Bool) -> [a] -> [(a, t)]
-- foldlOn (== 0) [1,0,0,0,0,0,1,0,0,0,1] -- [(1,0),(1,5),(1,3)]
foldlOn p xs = foldr g z xs 0
where
g x r i | p x = r (i+1)
| otherwise = (x,i) : r 0
z _i = []
Or you could combine span
/break
and unfoldr
to do the same.
You might find a way to use groupBy
with some post-processing step:
GHCi> groupBy (\a b -> (==0) b) [1,0,0,0,0,0,1,0,0,0,1]
[[1,0,0,0,0,0],[1,0,0,0],[1]]
GHCi> groupBy (const (==0)) [1,2,0,0,1,0,1]
[[1],[2,0,0],[1,0],[1]]
Finishing this should not be a problem.
Upvotes: 3
Reputation: 7139
You can always bring some builtin machinery. The Data.List
library is quite powerful:
import Data.List(mapAccumL)
import Data.Maybe(catMaybes)
foldrOn cond = catMaybes . snd . mapAccumL combine 0 where
combine a el =
if cond el then (a + 1, Nothing)
else (0, Just (el, a))
Essentially, foldrOn cond
is a composition of the following functions:
mapAccumL combine 0
which advances along the list modifying each element by information about the number of recently skipped entities (starting the count at 0
and resetting it whenever we find something that doesn't match the cond
predicate).snd
which discards the final state from the mapAccumL
's resultcatMaybes
which removes the Maybe
layer and leaves only the "present" values.Upvotes: 3