kevin

Reputation: 168

How to filter special characters from a long binary in Erlang?

Purpose: remove every "\r\n...\r\n" sequence from a long binary.

I have a very long binary (about 10000 bytes) like

<<"authtoken1,authtoken2...authtoken1000,\r\n1000\r\n,authoken1001,authoken1002...authken2000,\r\n15df\r\nauthoken2001,authoken2002..authoken7600....authoken10100">>.

I want to get:

<<"authtoken1,authtoken2...authtoken1000,authoken1001,authoken1002...authken2000,authoken2001,authoken2002..authoken7600....authoken10100">>.

My temporary solution is

13> Bin = <<"authtoken1,authtoken2,\r\n1\r\nauthoken3,\r\n2\r\nauthtoken4,authtoken5,\r\n3\r\nauthtoken6,authtoken7,authtoken8,\r\n2\r\nauthtoken9,authtoken10">>.
<<"authtoken1,authtoken2,\r\n1\r\nauthoken3,\r\n2\r\nauthtoken4,authtoken5,\r\n3\r\nauthtoken6,authtoken7,authtoken8,\r\n2\r\nauthtoken"...>>
14>  Bin2 = binary:split(Bin,[<<"\r\n">>],[global,trim]).
[<<"authtoken1,authtoken2,">>,<<"1">>,<<"authoken3,">>,
 <<"2">>,<<"authtoken4,authtoken5,">>,<<"3">>,
 <<"authtoken6,authtoken7,authtoken8,">>,<<"2">>,
 <<"authtoken9,authtoken10">>]
15> lists:foldl(fun(AuthToken,Acc) -> case erlang:size(AuthToken) >4 of true -> <<Acc/binary,AuthToken/binary>>; false -> Acc end end, <<>>, Bin2).
<<"authtoken1,authtoken2,authoken3,authtoken4,authtoken5,authtoken6,authtoken7,authtoken8,authtoken9,authtoken10">>

It works, but it is not efficient.

Upvotes: 2

Views: 164

Answers (2)

kevin

Reputation: 168

For reference only:

1> Bin = <<"authtoken1,authtoken2,\r\n1\r\nauthoken3,\r\n2\r\nauthtoken4,authtoken5,\r\n3\r\nauthtoken6,authtoken7,authtoken8,\r\n2\r\nauthtoken9,authtoken10">>.
<<"authtoken1,authtoken2,\r\n1\r\nauthoken3,\r\n2\r\nauthtoken4,authtoken5,\r\n3\r\nauthtoken6,authtoken7,authtoken8,\r\n2\r\nauthtoken"...>>
2> re:replace(Bin, <<"\r\n\\d+\r\n">>, <<"">>, [global, {return, binary} ]).
<<"authtoken1,authtoken2,authoken3,authtoken4,authtoken5,authtoken6,authtoken7,authtoken8,authtoken9,authtoken10">>
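The sample binary in the question also contains a hexadecimal marker (15df), which a digits-only pattern would miss. A minimal sketch, assuming every marker is a run of hex digits enclosed by two CRLF pairs; the module name strip_markers and the function strip/1 are hypothetical names used only for illustration:

```erlang
-module(strip_markers).
-export([strip/1]).

%% Remove every "\r\n<hex digits>\r\n" marker from Bin in one pass.
%% Assumption: markers contain only hex digits (e.g. "1000", "15df").
%% The \r and \n escapes are expanded by Erlang, so the regex sees
%% literal CR and LF bytes.
strip(Bin) when is_binary(Bin) ->
    re:replace(Bin, <<"\r\n[0-9a-fA-F]+\r\n">>, <<>>,
               [global, {return, binary}]).
```

If the same pattern is applied to many binaries, it may be worth compiling it once with re:compile/1 and passing the compiled pattern to re:replace/4.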

Upvotes: 2

Steve Vinoski

Reputation: 20004

I think you're asking for a result binary that contains only comma-separated authtoken data, with all other data removed? If so, try this:

1> {ok,Pattern} = re:compile("authtoken\\d+").
{ok,{re_pattern,0,0,0,
                <<69,82,67,80,91,0,0,0,0,0,0,0,81,0,0,0,255,255,255,255,
                  255,255,...>>}}
2> {match,Found} = re:run(InputBinary,Pattern,[global,{capture,all,binary}]).
{match,[[<<"authtoken1">>],
        [<<"authtoken2">>],
        [<<"authtoken1000">>]]}
3> lists:foldl(fun([V],<<>>) -> <<V/binary>>;
                  ([V],Acc) -> <<Acc/binary,$,,V/binary>> end, <<>>, Found).
<<"authtoken1,authtoken2,authtoken1000">>
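Appending to a binary inside a fold is usually fine, but the same result can also be built as an iolist and flattened once at the end. A minimal sketch of that variant, assuming OTP 19 or later for lists:join/2; the module name token_join and the function tokens/1 are hypothetical:

```erlang
-module(token_join).
-export([tokens/1]).

%% Extract every "authtoken<digits>" match from Bin and join the
%% matches with commas. Returns <<>> when nothing matches.
tokens(Bin) when is_binary(Bin) ->
    case re:run(Bin, <<"authtoken\\d+">>,
                [global, {capture, first, binary}]) of
        {match, Found} ->
            %% Found is a list of single-element lists: [[Tok1], [Tok2], ...]
            Tokens = [T || [T] <- Found],
            %% lists:join/2 interleaves the separator; the result is an
            %% iolist that is converted to a binary in a single step.
            iolist_to_binary(lists:join($,, Tokens));
        nomatch ->
            <<>>
    end.
```

Note that this pattern only matches the exact spelling "authtoken", so entries spelled differently in the input are dropped along with the markers.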

Upvotes: 4
