Reputation: 33
I need to split a binary like this:
Bin = <<"Hello my friend">>.
split_by_space(Bin).
and get:
[<<"Hello">>, <<"my">>, <<"friend">>]
Upvotes: 2
Views: 885
Reputation: 1958
No big deal, you can use binary:split/3:
1> Bin = <<"Hello my friend">>.
<<"Hello my friend">>
2> binary:split(Bin, <<" ">>, [global]).
[<<"Hello">>,<<"my">>,<<"friend">>]
3>
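One caveat: with [global] alone, leading, trailing, or repeated spaces produce empty binaries in the result. Here is a minimal sketch of the split_by_space/1 from the question, assuming you want those dropped. The module name split_demo is made up for illustration, and the trim_all option is not available on very old OTP releases.

```erlang
-module(split_demo).
-export([split_by_space/1]).

%% Split on single spaces and drop every empty part, so that
%% leading, trailing, and repeated spaces do not yield <<>> entries.
split_by_space(Bin) when is_binary(Bin) ->
    binary:split(Bin, <<" ">>, [global, trim_all]).
```

With this, split_demo:split_by_space(<<" te  st ">>) should return [<<"te">>,<<"st">>] rather than a list padded with empty binaries.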
Upvotes: 0
Reputation: 26121
There is a simpler solution that is roughly 2-10x more efficient than Pouriya's:
split(Bin) when is_binary(Bin) ->
    skip_spaces(Bin);
split(A) ->
    error(badarg, [A]).

skip_spaces(<<>>) ->                    % empty
    [];
skip_spaces(<<$\s, Rest/bytes>>) ->     % the next space
    skip_spaces(Rest);
skip_spaces(<<Bin/bytes>>) ->           % not a space
    get_word(Bin, 1).

get_word(Bin, I) ->
    case Bin of
        <<Word:I/bytes>> ->                     % the last word
            [Word];
        <<Word:I/bytes, $\s, Rest/bytes>> ->    % the next word
            [Word|skip_spaces(Rest)];
        _ ->                                    % a next char of the word
            get_word(Bin, I+1)
    end.
It parses at around 15-40 MB/s on a typical CPU.
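If you want to check that figure on your own machine, a rough sketch along these lines should do. The module name split_bench, the function throughput/1, and the test input are all assumptions made for illustration, and a single timer:tc run is only a ballpark measurement, not a proper benchmark.

```erlang
-module(split_bench).
-export([throughput/1]).

%% Fun is any fun taking one binary and returning the list of words.
%% Returns approximate throughput in MB/s (bytes per microsecond).
throughput(Fun) when is_function(Fun, 1) ->
    %% About 1.6 MB of input made of short space-separated words.
    Bin = binary:copy(<<"Hello my friend ">>, 100000),
    {Micros, _Words} = timer:tc(Fun, [Bin]),
    %% max/2 guards against a zero reading on a coarse clock.
    byte_size(Bin) / max(Micros, 1).
```

Pass it the function you want to measure, e.g. split_bench:throughput(fun(B) -> binary:split(B, <<" ">>, [global]) end), and run it a few times, since the first call includes warm-up noise.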
Upvotes: 0
Reputation: 1626
If you don't want to use the standard library, you can use:
-module(split).

%% API:
-export([split/1]).

split(Bin) when is_binary(Bin) ->
    split(Bin, <<>>, []).

%% A space while the buffer is empty (leading or repeated space): skip it
split(<<$\s:8, Rest/binary>>, <<>>, Result) ->
    split(Rest, <<>>, Result);
%% A space while the buffer is not empty: add the buffer to the list of words and empty it
split(<<$\s:8, Rest/binary>>, Buffer, Result) ->
    split(Rest, <<>>, [Buffer|Result]);
%% A character that is not a space: append it to the buffer
split(<<Char:8, Rest/binary>>, Buffer, Result) ->
    split(Rest, <<Buffer/binary, Char>>, Result);
%% Input and buffer are both empty: reverse the result and return it
split(<<>>, <<>>, Result) ->
    lists:reverse(Result);
%% Input is empty but the buffer holds a word: add it to the list, then reverse and return
split(<<>>, Buffer, Result) ->
    lists:reverse([Buffer|Result]).
Testing the code above:
1> split:split(<<"test">>).
[<<"test">>]
2> split:split(<<" test ">>).
[<<"test">>]
3> split:split(<<" te st ">>).
[<<"te">>,<<"st">>]
4> split:split(<<"">>).
[]
5> split:split(<<" ">>).
[]
Upvotes: 1
Reputation: 2033
You can simply use string:lexemes/2:
http://erlang.org/doc/man/string.html
lexemes(String :: unicode:chardata(), SeparatorList :: [grapheme_cluster()]) -> [unicode:chardata()]
Returns a list of lexemes in String, separated by the grapheme clusters in SeparatorList.
string:lexemes("foo bar", " ").
["foo","bar"]
string:lexemes(<<"foo bar">>, " ").
[<<"foo">>,<<"bar">>]
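The other function is string:split/3. Note that it splits only once unless you pass all (with trailing, it splits at the last separator):
string:split(<<"foo bar">>, " ", trailing).
[<<"foo">>,<<"bar">>]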
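The two functions behave differently once the input has repeated spaces, so here is a small sketch that puts them side by side. The module name string_split_demo and the function names words/1 and parts/1 are made up for illustration.

```erlang
-module(string_split_demo).
-export([words/1, parts/1]).

%% Drops empty results: consecutive separators count as one gap.
words(Bin) ->
    string:lexemes(Bin, " ").

%% Splits at every separator and keeps the empty parts
%% that sit between adjacent separators.
parts(Bin) ->
    string:split(Bin, " ", all).
```

For <<"foo  bar">> (two spaces), words/1 gives the two words, while parts/1 returns three elements because the empty part between the spaces is kept. For the question as asked, lexemes is probably the closer fit.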
Upvotes: 1