Bula

Reputation: 2792

Recursing directories only goes one file deep

I have the following code:

find_info(File) ->
    case file:read_file_info(File) of
        {ok, Facts} -> 
            case Facts#file_info.type of
                directory -> directory;
                regular -> regular
            end;
        {error,Reason} -> exit(Reason)
    end.

find_files(Dir,Flag,Ending,Acc) -> 
    case file:list_dir(Dir) of
        {ok,A} -> find_files_helper(A,Dir,Flag,Acc,Ending);
        {_,_} -> Acc
    end. 

find_files_helper([H|Tail],Dir,Flag,Acc,Ending) ->
    A = find_info(filename:absname_join(Dir,H)),
    case A of
        directory -> 
            case Flag of
                true -> 
                    find_files(filename:absname_join(Dir,H),Flag,Ending,Acc ++ find_files_helper(Tail,Dir,Flag,Acc,Ending));
                false -> find_files_helper(Tail,Dir,Flag,Acc,Ending)
            end;
        regular -> 
        case filename:extension(H) of
            Ending -> find_files_helper(Tail,Dir,Flag,[to_md5_large(H)] ++ Acc, Ending);
            _ -> find_files_helper(Tail,Dir,Flag,Acc,Ending)
        end;
        {error,Reason} -> exit(Reason)
    end;
find_files_helper([],_,_,Acc,_) -> Acc.

However, whenever I run find_files/4, the program only goes one file deep before crashing. Say I have the following directory:

home/
   a/
     ser.erl
   b/
   c/
file.erl
file2.erl

When I run it, I get the md5 of file.erl, file2.erl and ser.erl. However, if the directory looks like this:

home/
   a/
     ser.erl
     back.erl
   b/
   c/
file.erl
file2.erl

Then the whole program crashes. I have spent a good few hours looking for the flaw in my logic, but I have no idea what I'm missing.

The error message that I get is exception enoent in function p:to_md5_large/1.

In case the md5 is needed here it is:

to_md5_large(File)  ->

        case file:read_file(File) of
            {ok, <<B/binary>>} -> md5_helper(B,erlang:md5_init());
            {error,Reason} -> exit(Reason)
        end.

md5_helper(<<A:4/binary,B>>,Acc) -> md5_helper(B,erlang:md5_update(Acc,A));
md5_helper(A,Acc) -> 
    B =     erlang:md5_update(Acc,A),
    erlang:md5_final(B).

Upvotes: 0

Views: 90

Answers (2)

Pascal

Reputation: 14042

There is a library function in the filelib module that does this for you:

fold_files(Dir, RegExp, Recursive, Fun, AccIn) -> AccOut

in your case:

Result = filelib:fold_files(Dir, "\\.erl$", true, fun(X,Acc) -> {ok,B} = file:read_file(X), [erlang:md5(B)|Acc] end, []).

[edit]

@Bula:

I didn't answer your question directly, for two reasons:

  • The first is that, at the time I was writing my answer, you hadn't said what kind of error you were getting. It is very important, in any language, to learn how to extract information from an error report. In Erlang, most of the time you get the error type and the line where it occurred; checking the documentation for that error usually tells you what went wrong. By the way, unless you actually want to handle errors, I would discourage you from writing things like:

    case file:read_file(File) of
        {ok, <<B/binary>>} -> md5_helper(B,erlang:md5_init());
        {error,Reason} -> exit(Reason)
    end.
    

The following code does the same thing, is shorter, and gives you the exact line number where the problem occurred (it's not the best example in your code, but it's the shortest):

    {ok, <<B/binary>>} = file:read_file(File),
    md5_helper(B,erlang:md5_init()).
  • The second is that I find your code too long, with unnecessary helper functions. I think it is important to keep code concise and readable, and to use library functions as they are intended. For example, you use erlang:md5_init/0, erlang:md5_update/2 and erlang:md5_final/1 where a single call to erlang:md5/1 is enough in your case. The init/update/final functions exist so that the md5 can be calculated when the data arrives chunk by chunk, which is not your case, and the way you wrote the helper function does not take advantage of that feature anyway.
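To illustrate the point about erlang:md5/1 versus the incremental functions, here is a minimal sketch. The module name md5_demo and the function names one_shot/1 and chunked/1 are hypothetical and only serve the example; the erlang:md5* calls are the standard BIFs.

```erlang
%% Hypothetical demo module, not part of the original question's code.
-module(md5_demo).
-export([one_shot/1, chunked/1]).

%% Whole binary already in memory: a single call is enough.
one_shot(Bin) ->
    erlang:md5(Bin).

%% Data arriving as a list of binaries: feed each chunk into the context,
%% then finalize once at the end.
chunked(Chunks) ->
    Ctx = lists:foldl(fun(Chunk, Acc) -> erlang:md5_update(Acc, Chunk) end,
                      erlang:md5_init(),
                      Chunks),
    erlang:md5_final(Ctx).
```

Both approaches produce the same digest for the same bytes, so md5_demo:one_shot(<<"hello world">>) equals md5_demo:chunked([<<"hello ">>, <<"world">>]). The incremental form only pays off when the file is read in pieces rather than with file:read_file/1.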

I don't understand why you want a "deployed" version of your code, but here is another version in which I tried to follow my own advice (written directly in the shell, so it needs R17+ for named recursive anonymous functions) :o)

1> F = fun F(X,D,Ending) ->                                                                     
1>   {ok,StartD} = file:get_cwd(),        %% save current directory                                    
1>   ok = file:set_cwd(D),                %% move to the directory to explore                          
1>   R = case filelib:is_dir(X) of                                                                
1>     true ->                            %% if the element to analyze is a directory                                  
1>       {ok,Files} = file:list_dir(X),   %% get its content
1>       [F(Y,X,Ending) || Y <- Files];   %% and recursively analyze all its elements             
1>     false -> 
1>       case filelib:is_regular(X) andalso (filename:extension(X) == Ending) of         
1>         true ->                        %% if it is a regular file with the right extension              
1>           {ok,B} = file:read_file(X),  %% read it                                               
1>           [erlang:md5(B)];             %% and calculate the md5 (must be returned in a list
1>                                        %% for consistency with directory results)
1>         false ->                                                                                
1>           []                           %% in other cases (symlink, ...) return empty                                     
1>       end                                                                                       
1>   end,                                                                                        
1>   ok = file:set_cwd(StartD),           %% restore current directory                                    
1>   lists:flatten(R)                     %% flatten for nicer result                                       
1> end.                                                                                          
#Fun<erl_eval.42.90072148>
2> Md5 = fun(D) -> F(D,D,".erl") end.
#Fun<erl_eval.6.90072148>
3> Md5("C:/My programs/erl6.2/lib/stdlib-2.2").
[<<150,238,21,49,189,164,184,32,42,239,200,52,135,78,12,
   112>>,
 <<226,53,12,102,125,107,137,149,116,47,50,30,37,13,211,243>>,
 <<193,114,120,24,175,27,23,218,7,169,146,8,19,208,73,255>>,
 <<227,219,237,12,103,218,175,238,194,103,52,180,132,113,
   184,68>>,
 <<6,16,213,41,39,138,161,36,184,86,17,183,125,233,20,125>>,
 <<23,208,91,76,69,173,159,200,44,72,9,9,50,40,226,27>>,
 <<92,8,168,124,230,1,167,199,6,150,239,62,146,119,83,36>>,
 <<100,238,68,145,58,22,88,221,179,204,19,26,50,172,142,193>>,
 <<253,79,101,49,78,235,151,104,188,223,55,228,163,25,16,
   147>>,
 <<243,189,25,98,170,97,88,90,174,178,162,19,249,141,94,60>>,
 <<237,85,6,153,218,60,23,104,162,112,65,69,148,90,15,240>>,
 <<225,48,238,193,120,43,124,63,156,207,11,4,254,96,250,204>>,
 <<67,254,107,82,106,87,36,119,140,78,216,142,66,225,8,40>>,
 <<185,246,227,162,211,133,212,10,174,21,204,75,128,125,
   200,...>>,
 <<234,191,210,59,62,148,130,187,60,0,187,124,150,213,...>>,
 <<199,231,45,34,185,9,231,162,187,130,134,246,54,...>>,
 <<157,226,127,87,191,151,81,50,19,116,96,121,...>>,
 <<15,59,143,114,184,207,96,164,155,44,238,...>>,
 <<176,139,190,30,114,248,0,144,201,14,...>>,
 <<169,79,218,157,20,10,20,146,12,...>>,
 <<131,25,76,110,14,183,5,103,...>>,
 <<91,197,189,2,48,142,67,...>>,
 <<94,202,72,164,129,237,...>>,
 <<"^NQÙ¡8hÿèkàå"...>>,<<"ðÙ.Q"...>>,
 <<150,101,76,...>>,
 <<"A^ÏrÔ"...>>,<<"¹"...>>,<<...>>|...]
4>

Upvotes: 1

Steve Vinoski

Reputation: 20014

You're getting enoent because you're passing a filename like back.erl to to_md5_large when you're not in the directory where back.erl is located. Try passing the full filename instead. You're already calling filename:absname_join(Dir,H) in find_files_helper, so just save that to a variable and pass that variable instead of H to to_md5_large.
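As a rough sketch of that fix, the traversal below builds the joined path once per entry and uses it for every file operation. It is a simplified rewrite under assumptions, not the asker's exact code: the module name md5_walk and the three-argument find_files/3 are hypothetical, filename:join/2 stands in for absname_join/2, and erlang:md5/1 replaces to_md5_large/1.

```erlang
%% Hypothetical module illustrating the "pass the full path" fix.
-module(md5_walk).
-export([find_files/3]).

%% Returns the MD5 digests of regular files under Dir whose extension
%% equals Ending (e.g. ".erl"). Subdirectories are visited only when
%% Flag is true. Crashes with a badmatch if a directory or file cannot
%% be read, which reports the failing line.
find_files(Dir, Flag, Ending) ->
    {ok, Names} = file:list_dir(Dir),
    lists:foldl(
      fun(Name, Acc) ->
              %% Join once; never hand the bare Name to file functions.
              Path = filename:join(Dir, Name),
              case filelib:is_dir(Path) of
                  true when Flag ->
                      find_files(Path, Flag, Ending) ++ Acc;
                  true ->
                      Acc;
                  false ->
                      case filelib:is_regular(Path) andalso
                           filename:extension(Name) =:= Ending of
                          true ->
                              {ok, Bin} = file:read_file(Path),
                              [erlang:md5(Bin) | Acc];
                          false ->
                              Acc
                      end
              end
      end, [], Names).
```

Called as md5_walk:find_files("home", true, ".erl"), this reads a/back.erl through the path home/a/back.erl, so the result no longer depends on the current working directory.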

Upvotes: 1
