Jenia Be Nice Please

Reputation: 2693

split string with adjacent separators

I want to split a string without collapsing adjacent separators (recall that string:tokens ignores adjacent separators, so it never returns empty tokens).

So I have this for now:

split(L, C) -> lists:reverse([lists:reverse(X) || X <- split(L, C, [[]])]).

split([], _, Acc) -> Acc;
split([C|T], C, Acc) -> split(T, C, [[]|Acc]);
split([H|T], C, [AH|AT]) -> split(T, C, [[H|AH]|AT]).

The return value is ["12432524,,32453,4"] for the call tut6:split("12432524,,32453,4", ","). I don't understand what the problem is. Can someone please point it out to me?

The required output is ["12432524", "", "32453", "4"]

Thanks in advance for your kind help.

Upvotes: 1

Views: 264

Answers (1)

Steve Vinoski

Reputation: 20014

The problem is that you're comparing each character of the string you're splitting to the entire string separator. When you call

tut6:split("12432524,,32453,4", ",").

then the first invocation of split/3 inside your module is:

split([$1|"2432524,,32453,4"], ",", [[]]) ...

The head of the first argument is the character $1, but you compare it to the string ",". A character (an integer) never matches a string (a list), so the second clause of split/3 never fires and your input string never gets split.
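To make the mismatch concrete, here is a minimal sketch comparing a character to a one-character string. The module name sep_demo and the function demo/0 are made up for illustration and are not part of the original code.

```erlang
-module(sep_demo).
-export([demo/0]).

%% Hypothetical helper: contrasts a character with a one-character string.
demo() ->
    Char = $,,      % the character ',' is the integer 44
    Str  = ",",     % the string "," is the list [44]
    %% An integer is never equal to a list, but it does equal the list's head.
    {Char =:= Str, Char =:= hd(Str)}.
```

Calling sep_demo:demo() returns {false, true}, which is why the second clause of split/3 can only match once the separator argument is a character.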

There are a few ways to fix this:

  • Have the caller pass a character separator rather than a string, like this:

    tut6:split("12432524,,32453,4", $,).
    
  • Have the caller pass a string separator but use only its first character as the actual separator. You can achieve this by changing split(L,C) in your code to either

    split(L,[C]) ->
        lists:reverse([lists:reverse(X) || X <- split(L, C, [[]])]).
    

    to force a single character separator string, or

    split(L,[C|_]) ->
        lists:reverse([lists:reverse(X) || X <- split(L, C, [[]])]).
    

    to use only the first character as the separator and ignore any trailing characters.

  • Have the caller pass a string separator and treat each character in the string as a potential separator.
  • Have the caller pass a string separator and treat the entire string as a separator.
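The third option in the list above (each character of the separator string is a potential separator) could be sketched by swapping the pattern match on the separator for a lists:member/2 check. This is only one possible variation on the code from the question; the module name split_any is hypothetical.

```erlang
-module(split_any).
-export([split/2]).

%% Split L at every character that appears in Seps, keeping empty fields.
split(L, Seps) ->
    lists:reverse([lists:reverse(X) || X <- split(L, Seps, [[]])]).

%% Fields are accumulated in reverse, most recent field first.
split([], _, Acc) -> Acc;
split([H|T], Seps, [AH|AT] = Acc) ->
    case lists:member(H, Seps) of
        true  -> split(T, Seps, [[]|Acc]);      % separator: start a new field
        false -> split(T, Seps, [[H|AH]|AT])    % ordinary char: extend current field
    end.
```

With this version, split_any:split("12432524,,32453,4", ",") returns ["12432524","","32453","4"], and split_any:split("a,b;;c", ",;") returns ["a","b","","c"].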

You can achieve the last of these options (treating the entire string as the separator) using re:split/3:

split(L, C) -> re:split(L, C, [{return,list}]).

This code isn't fully correct, though: it works for a simple separator like "," but won't work in general unless you escape all regular expression metacharacters in the separator string.
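If regular expressions aren't needed at all, one way to sidestep the metacharacter issue is string:split/3 with the all option, which treats the separator as a literal string. This is a sketch that assumes OTP 20 or later, where string:split/3 is available; the module name split_whole is hypothetical.

```erlang
-module(split_whole).
-export([split/2]).

%% Assumes OTP 20+. Sep is matched literally, not as a regular expression,
%% and the 'all' option splits at every occurrence, keeping empty fields.
split(L, Sep) ->
    string:split(L, Sep, all).
```

Here split_whole:split("12432524,,32453,4", ",") returns ["12432524","","32453","4"] (the shell prints the empty field as []), and a separator such as "." is handled literally, so split_whole:split("a.b.c", ".") returns ["a","b","c"].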

Upvotes: 4
