pechenie
pechenie

Reputation: 1958

Best way for substring replace in haskell

The problem is quite simple: I have to replace all occurences of "fooo" and all its substrings with "xyz". In Java, for example, I will do it like this:

someString.replaceAll( "fooo|foo|fo", "xyz" )

and it will do the trick. But in Haskell I've found no efficient way to work with regex. First of all, I've read this: http://www.haskell.org/haskellwiki/Regular_expressions

The only library which actually has replace function is regex-posix, but its considered "very slow" in performance. And this fact is not acceptable. Also I've found that this replace function for any reasons doesn't respect the order of patterns given, so I've got output like this:

>replace "boo fooo boo" "xyz"
"boo xyzoo boo"

Other backends don't imply such functionality.

So I decided to write simple workaround:

replaceFoo input =
    helper input []
    where
        helper ('f':'o':'o':'o':xs) ys = helper xs ("zyx" ++ ys)
        helper ('f':'o':'o':xs) ys = helper xs ("zyx" ++ ys)
        helper ('f':'o':xs) ys = helper xs ("zyx" ++ ys)
        helper (x:xs) ys = helper xs (x:ys)
        helper [] ys = reverse ys

Whilst I don't find this function nice, it works well and fast. But for now I met the necessity to add more words in this replacor, and I don't like the idea to extend helper patterns anymore (I need to say that I actually have 4 words in it in real app and that's odd).

I'll be happy if someone help me with fast solution.


cebewee, thanks for the Data.String.Utils. But I fear this approach is quite slow if there are many words to replace ("fooo" to "xyz", "foo" to "xyz", "fo" to "xyz", "bar" to "quux" and so on), because to get that to work I will need to foldr (\str (from,to) -> replace from to str) input pairs or something like that and it will take O(n*n). More than that, it may have unexpected result of replacing substring of result of previous replacement.

Upvotes: 7

Views: 5399

Answers (3)

James Brock
James Brock

Reputation: 3426

The replace-megaparsec package allows you to search for pattern matches and then edit the found matches. Here is a solution using Replace.Megaparsec.streamEdit.

>>> import Replace.Megaparsec
>>> import Text.Megaparsec.Char
>>> streamEdit (chunk "fooo" <|> chunk "foo" <|> chunk "fo") (const "xyz") "boo fooo boo"
"boo xyz boo"

Upvotes: 0

stephen tetley
stephen tetley

Reputation: 4493

The regex-xmlschema package has a sed function that might be what you're looking for:

http://hackage.haskell.org/package/regex-xmlschema-0.1.3

See particularly:

http://hackage.haskell.org/packages/archive/regex-xmlschema/0.1.3/doc/html/Text-Regex-XMLSchema-String.html#v:sed

There was a discussion of options for string rewriting on Haskell-Cafe last year:

http://www.haskell.org/pipermail/haskell-cafe/2010-May/077943.html

Upvotes: 4

Lars Noschinski
Lars Noschinski

Reputation: 3667

There is Data.String.Utils.replace in the MissingH package. If you only need plain substring replace (and not regular expressions), this might be what you need.

Upvotes: 7

Related Questions