Reputation: 145
I have a string from which I want to remove all words beginning with the symbol @, but I do not fully understand how to do it expressively. It is clear that you could write something like this:
Split the string into words
Use the list filter to weed out unnecessary words
But I guess I don't understand how to split the string, because in addition to the space there are characters such as \t and \n. Besides, splitting would lose them, and I could not restore the original text.
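To make the problem concrete, here is a minimal sketch of the two steps above using the Prelude functions words, filter and unwords. The names naive and isMention are only illustrative. It shows how the original whitespace gets lost:

```haskell
-- Naive approach: split on whitespace, drop the "@" words, rejoin with spaces.
naive :: String -> String
naive = unwords . filter (not . isMention) . words
  where
    -- A word counts as a mention when its first character is '@'.
    isMention ('@' : _) = True
    isMention _ = False
```

Because words discards every kind of whitespace and unwords rejoins with single spaces, the \n in the input comes back as a plain space.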
An example of what I want to get:
original string:
haha lala\n@delete_me all-ok
expected result:
haha lala\nall-ok
Upvotes: 0
Views: 76
Reputation: 55059
Another way to look at the problem is that we want to delete strings of non-spaces that begin with an at sign @, as well as any following spaces. We don’t want to treat line breaks or other characters specially at all. That can be expressed with a simple recursive function using span/break and dropWhile:
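import Data.Char (isSpace)

-- Remove every word that starts with '@', along with the whitespace after it.
censor :: String -> String
censor "" = ""
censor text0 = spaces ++ nonspaces ++ censor rest
  where
    (spaces, text1) = span isSpace text0
    (word, text2) = break isSpace text1
    (nonspaces, rest)
      | banned word = ("", trim text2)
      | otherwise = (word, text2)

-- A word is banned when its first character is '@'.
banned :: String -> Bool
banned ('@' : _) = True
banned _ = False

-- Drop leading whitespace.
trim :: String -> String
trim = dropWhile isSpace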
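As a quick reminder of what the three building blocks do on their own, here is a small sketch. The top-level names leadingSpaces, firstWord and trimmed are made up for illustration:

```haskell
import Data.Char (isSpace)

-- span takes the longest prefix that satisfies the predicate.
leadingSpaces :: (String, String)
leadingSpaces = span isSpace "  send @beans"   -- ("  ", "send @beans")

-- break takes the longest prefix that does NOT satisfy the predicate.
firstWord :: (String, String)
firstWord = break isSpace "send @beans"        -- ("send", " @beans")

-- dropWhile discards the matching prefix and keeps only the remainder.
trimmed :: String
trimmed = dropWhile isSpace "  money"          -- "money"
```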
Consider an example:
censor " send @beans money to [email protected]"
    span returns " " and "send @beans…"
    break returns "send" and " @beans…"
    banned returns false for "send", so we will keep it
    censor " @beans money…"
        span returns " " and "@beans money…"
        break returns "@beans" and " money…"
        banned returns true for "@beans", so we drop it and trim the rest
        censor "money…"
            … and so on for the remaining words, including [email protected], which is kept since it is not banned
            censor "" returns ""
The end result is this expression:
" " ++ "send" ++ " " ++ "" ++ "money" ++ " " ++ "to" ++ " " ++ "[email protected]" ++ ""
Notice that we use a series of updates to the input string, resulting in a series of variables text0, text1, text2, rest for the intermediate states. Consider how you could express this pattern using State instead.
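One possible way to do that, as a rough sketch rather than the definitive answer: keep the not-yet-consumed input as the state, so the numbered variables disappear. This assumes the mtl package for Control.Monad.State. The helper names takeWhileS and censorS are invented here.

```haskell
import Control.Monad.State (State, evalState, get, modify, state)
import Data.Char (isSpace)

-- Consume the longest prefix satisfying the predicate and return it;
-- the remainder of the input becomes the new state.
takeWhileS :: (Char -> Bool) -> State String String
takeWhileS p = state (span p)

-- Same logic as the recursive censor, with the input threaded implicitly.
censorS :: State String String
censorS = do
  remaining <- get
  if null remaining
    then pure ""
    else do
      spaces <- takeWhileS isSpace
      word <- takeWhileS (not . isSpace)
      kept <-
        if banned word
          then "" <$ modify (dropWhile isSpace) -- drop the word and trailing spaces
          else pure word
      rest <- censorS
      pure (spaces ++ kept ++ rest)

banned :: String -> Bool
banned ('@' : _) = True
banned _ = False

-- Run the stateful action on the whole input and return the censored text.
censor :: String -> String
censor = evalState censorS
```

Applied to the string from the question, censor "haha lala\n@delete_me all-ok" gives "haha lala\nall-ok".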
Upvotes: 1
Reputation: 6703
You might want to use Data.List.Split.split with Data.List.Split.oneOf.
It returns the split words including the separators, so you can rebuild the text from them.
split (oneOf "xyz") "aazbxyzcxd" == ["aa","z","b","x","","y","","z","c","x","d"]
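A minimal sketch of how that could be applied to the question, assuming the split package is installed and that space, \t and \n are the only separators of interest. The names removeMentions, isMention and isSep are illustrative only.

```haskell
import Data.List.Split (oneOf, split)

-- Tokenise while keeping the separators, drop each "@" word together with
-- the separators that directly follow it, then glue the tokens back together.
removeMentions :: String -> String
removeMentions = concat . go . split (oneOf " \t\n")
  where
    go [] = []
    go (tok : rest)
      | isMention tok = go (dropWhile isSep rest)
      | otherwise = tok : go rest

    isMention ('@' : _) = True
    isMention _ = False

    -- Separator tokens, plus the empty strings split emits between
    -- adjacent separators.
    isSep tok = null tok || tok `elem` [" ", "\t", "\n"]
```

For the string from the question, removeMentions "haha lala\n@delete_me all-ok" yields "haha lala\nall-ok", since the \n before the mention is an ordinary token that is kept.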
Upvotes: 1