Erik

Reputation: 957

Filtering ANSI escape sequences from a ByteString with Conduit

I'm trying to make a Conduit that filters ANSI escape codes out of ByteStrings. I've come up with a function that converts each ByteString into a stream of Word8s, does the filtering, and converts the result back into a stream of ByteStrings at the end.

It seems to work fine when I use it in GHCi:

> runConduit $ yield "hello\27[23;1m world" .| ansiFilter .| printC
"hello world"

When I use it in my application, conduits that contain ansiFilter don't seem to pass anything through. Here is the full source:

{-# LANGUAGE OverloadedStrings #-}

module Main where

import Conduit
import Control.Concurrent.Async
import Control.Concurrent.STM
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.Conduit.TQueue
import Data.Word8 (Word8)
import qualified Data.Word8 as Word8

main :: IO ()
main = do

      queue <- atomically $ newTBQueue 25
      let qSource = sourceTBQueue queue
      atomically $ writeTBQueue queue ("hello" :: ByteString)

      race_
        (putInputIntoQueue queue)
        (doConversionAndPrint qSource)

putInputIntoQueue q =
  runConduit
    $ stdinC
    .| iterMC (atomically . writeTBQueue q)
    .| sinkNull

doConversionAndPrint src =
  runConduit
    $ src
    .| ansiFilter
    .| stdoutC

ansiFilter :: MonadIO m => ConduitM ByteString ByteString m ()
ansiFilter = toWord8 .| ansiFilter' .| toByteString
  where
    ansiFilter' = awaitForever $ \first -> do
      msecond <- peekC
      case (first, msecond) of
        (0x1b, Just 0x5b) -> do
          dropWhileC (not . Word8.isLetter)
          dropC 1
        _ -> yield first

    toWord8 = concatC

    toByteString :: Monad m => ConduitM Word8 ByteString m ()
    toByteString =
      (mapC BS.singleton .| foldC) >>= yield

This program is supposed to echo back the filtered contents of stdin, but nothing gets echoed back.

However, if I comment out the ansiFilter in doConversionAndPrint, echoing does work, which makes me think the ansiFilter function is wrong.

Any help would be greatly appreciated!

Upvotes: 6

Views: 264

Answers (2)

Michael Snoyman

Reputation: 31315

I reimplemented ansiFilter in terms of the higher-level chunked data functions in conduit-combinators, like takeWhileCE. This seems to work, and should be more efficient because more of the data stays in its packed ByteString representation instead of being split into individual bytes:

-- Needs: import Control.Exception (assert)
ansiFilter :: MonadIO m => ConduitM ByteString ByteString m ()
ansiFilter = loop
  where
    loop = do
      takeWhileCE (/= 0x1b)
      mfirst <- headCE
      case mfirst of
        Nothing -> return ()
        Just first -> assert (first == 0x1b) $ do
          msecond <- peekCE
          case msecond of
            Just 0x5b -> do
              dropWhileCE (not . Word8.isLetter)
              dropCE 1
            _ -> yield $ BS.singleton first
          loop
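
For context on why the version in the question passes nothing through: its toByteString runs foldC over the whole input before it yields, so it can only emit one ByteString, and only after upstream has finished. A source that stays open (such as one reading from a queue) never reaches that point, while the single yield in GHCi ends immediately. The following is a minimal sketch of that difference, not a definitive implementation. Assumptions: the names foldingToByteString and isAsciiLetter are made up for illustration (the latter stands in for Word8.isLetter so the example only needs conduit and bytestring), and the constraint is relaxed from MonadIO to Monad so it can run with runConduitPure.

```haskell
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Conduit
import Control.Exception (assert)
import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Hypothetical stand-in for Word8.isLetter (ASCII letters only).
isAsciiLetter :: Word8 -> Bool
isAsciiLetter w = (w >= 0x41 && w <= 0x5a) || (w >= 0x61 && w <= 0x7a)

-- The question's toByteString under an illustrative name: foldC consumes
-- all input first, so the single yield happens only at end of stream.
foldingToByteString :: Monad m => ConduitT Word8 ByteString m ()
foldingToByteString = (mapC BS.singleton .| foldC) >>= yield

-- The chunked filter from this answer, with Monad instead of MonadIO.
ansiFilter :: Monad m => ConduitT ByteString ByteString m ()
ansiFilter = loop
  where
    loop = do
      takeWhileCE (/= 0x1b)          -- pass through everything before ESC
      mfirst <- headCE
      case mfirst of
        Nothing -> return ()         -- upstream is finished
        Just first -> assert (first == 0x1b) $ do
          msecond <- peekCE
          case msecond of
            Just 0x5b -> do          -- ESC followed by '[': drop the sequence
              dropWhileCE (not . isAsciiLetter)
              dropCE 1
            _ -> yield $ BS.singleton first
          loop

main :: IO ()
main = do
  let input = ["ab", "cd"] :: [ByteString]
  -- Folding version: the first output already contains the whole input.
  print (runConduitPure (yieldMany input .| concatC .| foldingToByteString .| headC))
  -- Chunked version: the first output is available after the first chunk.
  print (runConduitPure (yieldMany input .| ansiFilter .| headC))
  -- The escape sequence is still removed.
  print (BS.concat (runConduitPure (yield "hello\27[23;1m world" .| ansiFilter .| sinkList)))
```

The first print shows Just "abcd" and the second Just "ab", which is the behaviour that matters for an interactive echo: output becomes available per chunk rather than once at the end.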

Upvotes: 2

Erik

Reputation: 957

Went with a slightly different approach and am having more luck leaving the ByteStrings alone. I think this gives up some of the streaming stuff, but is acceptable for my use-case.

ansiFilter :: Monad m => Conduit ByteString m ByteString
ansiFilter = mapC (go "")
  where
    csi = "\27["
    go acc "" = acc
    go acc remaining = go (acc <> filtered) (stripCode unfiltered)
      where
        (filtered, unfiltered) = BS.breakSubstring csi remaining
        stripCode bs = BS.drop 1 (BS.dropWhile (not . Word8.isLetter) bs)
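
To make the streaming trade-off concrete: because mapC handles each chunk on its own, an escape sequence that happens to be split across two chunks is only partly removed. Here is a small sketch of that, assuming the per-chunk function is lifted out under the hypothetical name stripAnsi, and with a hypothetical isAsciiLetter helper standing in for Word8.isLetter so that only bytestring is needed.

```haskell
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Data.ByteString (ByteString)
import qualified Data.ByteString as BS
import Data.Word (Word8)

-- Hypothetical stand-in for Word8.isLetter (ASCII letters only).
isAsciiLetter :: Word8 -> Bool
isAsciiLetter w = (w >= 0x41 && w <= 0x5a) || (w >= 0x61 && w <= 0x7a)

-- The per-chunk filter from above, without the conduit wrapper.
stripAnsi :: ByteString -> ByteString
stripAnsi = go ""
  where
    csi = "\27["
    go acc "" = acc
    go acc remaining = go (acc <> filtered) (stripCode unfiltered)
      where
        -- Split at the next "ESC [" (unfiltered is empty if there is none).
        (filtered, unfiltered) = BS.breakSubstring csi remaining
        -- Drop up to and including the terminating letter.
        stripCode bs = BS.drop 1 (BS.dropWhile (not . isAsciiLetter) bs)

main :: IO ()
main = do
  -- Whole sequence inside one chunk: removed.
  print (stripAnsi "hello\27[23;1m world")
  -- Same bytes split mid-sequence: the tail "3;1m" leaks through.
  print (map stripAnsi ["hello\27[2", "3;1m world"])
```

This prints "hello world" and then ["hello","3;1m world"]. If input arrives line by line and sequences never straddle a chunk boundary, the per-chunk approach is fine.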

Upvotes: 1
