haskellfunctional-programminglist-processing

Reputation: 28404

Given a list, how can I perform some transformation only on sub-lists whose each two elements satisfy a binary predicate?

(In my actual use case I have a list of type [SomeType], SomeType having a finite number of constructors, all nullary; in the following I'll use String instead of [SomeType] and use only 4 Chars, to simplify a bit.)

I have a list like this "aaassddddfaaaffddsssadddssdffsdf" where each element can be one of 'a', 's', 'd', 'f', and I want to do some further processing on each contiguous sequence of non-as, let's say turning them upper case and reversing the sequence, thus obtaining "aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD". (I've added the reversing requirement to make it clear that the processing involves all the contiguous non 'a'-s at the same time.)

To turn each sub-String upper case, I can use this:

func :: String -> String
func = reverse . map Data.Char.toUpper

But how do I run that func only on the sub-Strings of non-'a's?

My first thought is that Data.List.groupBy can be useful, and the overall solution could be:

concat $ map (\x -> if head x == 'a' then x else func x)
       $ Data.List.groupBy ((==) `on` (== 'a')) "aaassddddfaaaffddsssadddssdffsdf"

This solution, however, does not convince me, as I'm using == 'a' both when grouping (which to me seems good and unavoidable) and when deciding whether I should turn a group upper case.

I'm looking for advices on how I can accomplish this small task in the best way.

Upvotes: 2

Answers (5)

Will Ness

Reputation: 71065

We can just do what you describe, step by step, getting a clear simple minimal code which we can easily read and understand later on:

foo :: (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
foo p f xs = [ a
         | g <- groupBy ((==) `on` fst) 
                      [(p x, x) | x <- xs]  -- [ (True, 'a'), ... ]
         , let (t:_, as) = unzip g          -- ( [True, ...], "aaa" )
         , a <- if t then as else (f as) ]  -- final concat

         -- unzip :: [(b, a)] -> ([b], [a])

We break the list into same-p spans and unpack each group with the help of unzip. Trying it out:

> foo (=='a') reverse "aaabcdeaa"
"aaaedcbaa"

So no, using == 'a' is avoidable and hence not especially good, introducing an unnecessary constraint on your data type when all we need is equality on Booleans.

Upvotes: 0

Daniel Wagner

Reputation: 152717

There are other answers here, but I think they get too excited about iteration abstractions. A manual recursion, alternately taking things that match the predicate and things that don't, makes this problem exquisitely simple:

onRuns :: Monoid m => (a -> Bool) -> ([a] -> m) -> ([a] -> m) -> [a] -> m
onRuns p = go p (not . p) where
    go _ _ _ _ [] = mempty
    go p p' f f' xs = case span p xs of
        (ts, rest) -> f ts `mappend` go p' p f' f rest

Try it out in ghci:

Data.Char> onRuns ('a'==) id (reverse . map toUpper) "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"

Upvotes: 1

Håkan

Reputation: 154

Here is a simple solution - function process below - that only requires that you define two functions isSpecial and func. Given a constructor from your type SomeType, isSpecial determines whether it is one of those constructors that form a special sublist or not. The function func is the one you included in your question; it defines what should happen with the special sublists.

The code below is for character lists. Just change isSpecial and func to make it work for your lists of constructors.

isSpecial c = c /= 'a'
func = reverse . map toUpper

turn = map (\x -> ([x], isSpecial x)) 

amalgamate []  = []
amalgamate [x] = [x]
amalgamate ((xs, xflag) : (ys, yflag) : rest)
   | xflag /= yflag = (xs, xflag) : amalgamate ((ys, yflag) : rest)
   | otherwise      = amalgamate ((xs++ys, xflag) : rest)

work = map (\(xs, flag) -> if flag then func xs else xs)

process = concat . work . amalgamate . turn

Let's try it on your example:

*Main> process "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
*Main>

Applying one function at a time, shows the intermediate steps taken:

*Main> turn "aaassddddfaaaffddsssadddssdffsdf"
[("a",False),("a",False),("a",False),("s",True),("s",True),("d",True),
("d",True),("d",True),("d",True),("f",True),("a",False),("a",False),
("a",False),("f",True),("f",True),("d",True),("d",True),("s",True),
("s",True),("s",True),("a",False),("d",True),("d",True),("d",True),
("s",True),("s",True),("d",True),("f",True),("f",True),("s",True),
("d",True),("f",True)]
*Main> amalgamate it
[("aaa",False),("ssddddf",True),("aaa",False),("ffddsss",True),
("a",False),("dddssdffsdf",True)]
*Main> work it
["aaa","FDDDDSS","aaa","SSSDDFF","a","FDSFFDSSDDD"]
*Main> concat it
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"
*Main>

Upvotes: 0

Jon Purdy

Reputation: 54981

You could classify the list elements by the predicate before grouping. Note that I’ve reversed the sense of the predicate to indicate which elements are subject to the transformation, rather than which elements are preserved.

{-# LANGUAGE ScopedTypeVariables #-}

import Control.Arrow ((&&&))
import Data.Function (on)
import Data.Monoid (First(..))

mapSegmentsWhere
  :: forall a. (a -> Bool) -> ([a] -> [a]) -> [a] -> [a]
mapSegmentsWhere p f
  = concatMap (applyMatching . sequenceA)  -- [a]
  . groupBy ((==) `on` fst)                -- [[(First Bool, a)]]
  . map (First . Just . p &&& id)          -- [(First Bool, a)]
  where
    applyMatching :: (First Bool, [a]) -> [a]
    applyMatching (First (Just matching), xs)
      = applyIf matching f xs

    applyIf :: forall a. Bool -> (a -> a) -> a -> a
    applyIf condition f
      | condition = f
      | otherwise = id

Example use:

> mapSegmentsWhere (/= 'a') (reverse . map toUpper) "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"

Here I use the First monoid with sequenceA to merge the lists of adjacent matching elements from [(Bool, a)] to (Bool, [a]), but you could just as well use something like map (fst . head &&& map snd). You can also skip the ScopedTypeVariables if you don’t want to write the type signatures; I just included them for clarity.

Upvotes: 1

danidiaz

Reputation: 27756

If we need to remember the difference between the 'a's and the rest, let's put them in different branches of an Either. In fact, let's define a newtype now that we are at it:

{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE ViewPatterns #-}

import Data.Bifoldable
import Data.Char
import Data.List

newtype Bunched a b = Bunched [Either a b] deriving (Functor, Foldable)

instance Bifunctor Bunched where
  bimap f g (Bunched b) = Bunched (fmap (bimap f g) b)

instance Bifoldable Bunched where
  bifoldMap f g (Bunched b) = mconcat (fmap (bifoldMap f g) b)

fmap will let us work over the non-separators. fold will return the concatenation of the non-separators, bifold will return the concatenation of everything. Of course, we could have defined separate functions unrelated to Foldable and Bifoldable, but why avoid already existing abstractions?

To split the list, we can use an unfoldr that alternately searches for as and non-as with the span function:

splitty :: Char -> String -> Bunched String String
splitty c str = Bunched $ unfoldr step (True, str)
  where
    step (_, []) = Nothing
    step (True, span (== c) -> (as, ys)) = Just (Left as, (False, ys))
    step (False, span (/= c) -> (xs, ys)) = Just (Right xs, (True, ys))

Putting it to work:

ghci> bifold . fmap func . splitty 'a' $ "aaassddddfaaaffddsssadddssdffsdf"
"aaaFDDDDSSaaaSSSDDFFaFDSFFDSSDDD"

Note: Bunched is actually the same as Tannen [] Either from the bifunctors package, if you don't mind the extra dependency.

Upvotes: 1

Given a list, how can I perform some transformation only on sub-lists whose each two elements satisfy a binary predicate?

Answers (5)

Related Questions