Reputation: 95880
I want to remove all the whitespace, i.e. tabs, spaces, and newline characters.
T = {xmlelement,"presence",
[{"xml:lang","en"}],
[{xmlcdata,<<"\n">>},
{xmlelement,"priority",[],
[{xmlcdata,<<"5">>}]},
{xmlcdata,<<"\n">>},
{xmlelement,"c",
[{"xmlns",
"http://jabber.org/protocol/caps"},
{"node","http://psi-im.org/caps"},
{"ver","0.12.1"},
{"ext","cs ep-notify html"}],
[]},
{xmlcdata,<<"\n">>}]}.
I tried the following, but it does not work:
trim_whitespace(Input) ->
re:replace(Input, "(\r\n)*", "").
Upvotes: 1
Views: 5863
Reputation: 3344
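re:replace is tricky. Without the global option only the first match is replaced, and without {return, list} you get iodata back rather than a flat string: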
Eshell V5.9.3.1 (abort with ^G)
1> re:replace("0 1 2 3 4 5 6 7 8 9", " ", "", [global, {return, list}]).
"0123456789"
2> re:replace("0 1 2 3 4 5 6 7 8 9", " ", "", [{return, list}]).
"01 2 3 4 5 6 7 8 9"
3> re:replace("0 1 2 3 4 5 6 7 8 9", " ", "").
[<<"0">>,[]|<<"1 2 3 4 5 6 7 8 9">>]
Upvotes: 0
Reputation: 2496
I faced the same issue and came here to share my more efficient version:
trim(Subject) ->
{match, [[Trimmed]|_]} = re:run(Subject, "^\\s*([^\\s]*(?:.*[^\\s]+)?)\\s*$",
[{capture, all_but_first, binary}, global, dollar_endonly, unicode, dotall]),
Trimmed.
The idea is very much the same. The regex is just better.
Upvotes: 2
Reputation: 7129
All the whitespace in your question is in cdata sections - why not just filter those out of the tuple?
remove_cdata(List) when is_list(List) ->
remove_list_cdata(List);
remove_cdata({xmlelement, Name, Attrs, Els}) ->
{xmlelement, Name, remove_cdata(Attrs), remove_cdata(Els)}.
remove_list_cdata([]) ->
[];
remove_list_cdata([{xmlcdata,_}|Rest]) ->
remove_list_cdata(Rest);
remove_list_cdata([E = {xmlelement,_,_,_}|Rest]) ->
[remove_cdata(E) | remove_list_cdata(Rest)];
remove_list_cdata([Item | Rest]) ->
[Item | remove_list_cdata(Rest)].
remove_cdata(T) =:=
{xmlelement,"presence",
[{"xml:lang","en"}],
[{xmlelement,"priority",[],[]},
{xmlelement,"c",
[{"xmlns","http://jabber.org/protocol/caps"},
{"node","http://psi-im.org/caps"},
{"ver","0.12.1"},
{"ext","cs ep-notify html"}],
[]}]}
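Note that the function above drops every xmlcdata element, including the <<"5">> inside priority. If the goal is to drop only the whitespace-only text nodes, a variant along these lines might work. This is a minimal sketch: the module name cdata_filter and the functions strip_ws/1, strip_children/1, strip_child/1 and is_blank/1 are hypothetical names, and it assumes cdata values are binaries, as in the question.

```erlang
-module(cdata_filter).
-export([strip_ws/1]).

%% Recursively drop xmlcdata children that contain only whitespace.
%% Attributes are left untouched.
strip_ws({xmlelement, Name, Attrs, Els}) ->
    {xmlelement, Name, Attrs, strip_children(Els)}.

%% Keep every child that is not blank cdata; recurse into elements.
strip_children(Els) ->
    [strip_child(E) || E <- Els, not is_blank(E)].

strip_child({xmlelement, _, _, _} = E) -> strip_ws(E);
strip_child(Other) -> Other.

%% True for cdata whose binary consists solely of whitespace (or is empty).
is_blank({xmlcdata, Bin}) when is_binary(Bin) ->
    re:run(Bin, "^\\s*$", [{capture, none}]) =:= match;
is_blank(_) ->
    false.
```

With the T from the question, cdata_filter:strip_ws(T) should keep {xmlcdata,<<"5">>} under priority while removing the three {xmlcdata,<<"\n">>} entries.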
Upvotes: 0
Reputation: 13842
If you want to replace every match in a string, you need to pass the global option to re:replace(). Your regex also matches only \r\n sequences, so tabs and spaces are left alone. The call should probably look like this:
trim_whitespace(Input) -> re:replace(Input, "\\s+", "", [global]).
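By default re:replace returns iodata, so if you need a flat value downstream you can ask for one with the return option. A small sketch, where ws_demo and strip_all/1 are hypothetical names chosen for illustration:

```erlang
-module(ws_demo).
-export([strip_all/1]).

%% Remove every run of whitespace (spaces, tabs, CR, LF)
%% and return the result as a flat binary.
strip_all(Input) ->
    re:replace(Input, "\\s+", "", [global, {return, binary}]).
```

For example, ws_demo:strip_all("a \t b\r\nc") returns <<"abc">>. This works on strings and binaries only, so for the xmlelement tuple in the question you would apply it to the individual cdata binaries rather than to the tuple itself.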
Upvotes: 4