Jan Synáček
Jan Synáček

Reputation: 355

Haskell "string movement" functions

I'm trying to build a simple text editor using Haskell, hscurses and Data.Text. I'm quite new to Haskell.

Here is a snippet from my code:

data Cursor = Cursor {
  position :: Int,
  line     :: Int,
  column   :: Int
} deriving (Eq, Show)

isNewline :: Char -> Bool
isNewline c = c == '\n'

onNewline :: T.Text -> Int -> Bool
onNewline buf pos
  | pos >= T.length buf = False
  | otherwise           = isNewline $ T.index buf pos

findIndex :: (Char -> Bool) -> T.Text -> Int -> Maybe Int
findIndex pred buf pos
  | buf == T.empty = Just 0
  | otherwise      = rightWhile pos
  where rightWhile pos
          | pos > bufMax buf       = Nothing
          | pred $ T.index buf pos = Just pos
          | otherwise              = rightWhile (pos + 1)

findIndexLeft :: (Char -> Bool) -> T.Text -> Int -> Maybe Int
findIndexLeft pred buf pos = leftWhile pos
  where leftWhile pos
          | pos < 0                = Nothing
          | pred $ T.index buf pos = Just pos
          | otherwise              = leftWhile (pos - 1)

startOfLine :: T.Text -> Int -> Int
startOfLine buf pos = case findIndexLeft isNewline buf (pos - 1) of
  Nothing -> 0
  Just p  -> p + 1

endOfLine :: T.Text -> Int -> Int
endOfLine buf pos = case findIndex isNewline buf pos of
  Nothing -> 1 + bufMax buf
  Just p  -> p

lineOffset :: T.Text -> Int -> Int
lineOffset buf pos = pos - startOfLine buf pos

lineLength :: T.Text -> Int -> Int
lineLength buf pos = endOfLine buf pos - startOfLine buf pos

bufMax :: T.Text -> Int
bufMax buf = max 0 $ T.length buf - 1

bufLines :: T.Text -> Int
bufLines = T.foldl (\acc c -> if isNewline c then (acc+1) else acc) 0

moveCursorRight :: T.Text -> Cursor -> Cursor
moveCursorRight buf c@(Cursor pos line col)
  | buf == T.empty = c
  | otherwise      = Cursor newPos newLine newCol
  where end       = 1 + bufMax buf
        onEnd     = pos == end
        newPos    = clip (pos + 1) 0 end
        newLine   = if onNewline buf pos && not onEnd
                    then line + 1
                    else line
        newCol    = lineOffset buf newPos

moveCursorLeft :: T.Text -> Cursor -> Cursor
moveCursorLeft buf (Cursor pos line col) =
  Cursor newPos newLine newCol
  where onStart   = pos == 0
        newPos    = clipLow (pos - 1) 0
        newLine   = if onNewline buf newPos && not onStart
                    then line - 1
                    else line
        newCol    = lineOffset buf newPos

-- More movement functions follow...

The problem with this code is that for buffers that are thousands of lines long, it gets very slow. That's probably because of the use of index functions, that are O(n) and not constant time as they would be in C, for example.

How would a seasoned Haskeller approach this? What would be a reasonably efficient way to implement "movement" in strings in Haskell? The movement should also be composable, that is, I would like to be able to implement "Page down" in terms of "Move one line down", etc.

Edit: Update

Should anyone ever need this, here is what I ended up with.

type Line = T.Text

data BufferContext = BufferContext {
  before   :: [Line],
  at       :: Line,
  after    :: [Line]
} deriving (Eq, Show)

moveCursorRight :: Cursor -> Cursor
moveCursorRight c@(Cursor pos line col bc@(BufferContext before at after))
  | col >= T.length at = moveCursorDown c
  | otherwise          = Cursor (pos+1) line (col+1) bc

moveCursorLeft :: Cursor -> Cursor
moveCursorLeft c@(Cursor pos line col bc@(BufferContext before at after))
  | col <= 0  = upCursor { column = if null before then 0 else T.length $ head before }
  | otherwise = Cursor (pos-1) line (col-1) bc
  where upCursor = moveCursorUp c

moveCursorDown :: Cursor -> Cursor
moveCursorDown c@(Cursor _ _ _ (BufferContext _ _ [])) = c
moveCursorDown c@(Cursor _ cLine _ (BufferContext before at (l:ls))) =
  c { line    = cLine+1,
      column  = 0,
      context = BufferContext (at:before) l ls
    }

moveCursorUp c@(Cursor _ _ _ (BufferContext [] _ _)) = c
moveCursorUp c@(Cursor _ cLine _ (BufferContext (l:ls) at after)) =
  c { line = cLine-1,
      column = 0,
      context = BufferContext ls l (at:after)
    }

This implementation is very usable on 1 million lines and that is good enough for me. There is still one problem with this approach, though. If I want to jump to a random line, I have to move one by one, which can be slow. But still, this is a great improvement over the original approach.

I have also tried implementing the context as

data BufferContext = BufferContext {
  before   :: T.Text,
  at       :: Char,
  after    :: T.Text
} deriving (Eq, Show)

but that doesn't help too much, because "at" has to be consed with "before" and according to the documentation, T.cons is O(n)... Also, the line centric approach is better when the actual display is done.

Thanks to all who have helped!

Upvotes: 2

Views: 409

Answers (1)

Paul Johnson
Paul Johnson

Reputation: 17786

As gallais says in the comments, you want to use a zipper. The idea is that your "cursor" is actually a data structure like this:

type Line = T.Text

data TextZipper = TextZipper {
   textBefore :: [Line],
   currentLine :: Line,
   textAfter :: [Line]
}

The trick is that "textBefore" contains the lines above the cursor in reverse order. So to move down a line you put the "currentLine" at the head of "textBefore" and get the new "currentLine" out of "textAfter", like this:

moveDown :: TextZipper -> Maybe TextZipper
moveDown tzip = case textAfter tzip of
   [] -> Nothing    -- Already at the bottom of the file.
   t:ts -> TextZipper {
      textBefore = currentLine tzip : textBefore tzip,
      currentLine = t,
      textAfter = ts
   }

moveUp will be very similar. You will also need a textZipperToList function to extract the contents of the zipper for saving, and a textZipperFromList function.

I recall reading somewhere that Emacs uses a similar concept except that it does it by character rather than by line. The buffer is represented as two blocks of text, one the block before the cursor, and one the block after the cursor. Moving the cursor is accomplished by copying characters from one block to the other. Same concept here. Given this you might want to think about replacing each list of lines with a single Text value.

Upvotes: 3

Related Questions