d_vincent

Reputation: 29

Explanation needed on a "find | awk" command (Linux)

I am trying to understand the command below:

find ./ -type f |
awk -F / -v OFS=/ '{$NF="";dir[$0]++} END {for (i in dir) print dir[i]i}'

The output looks like this:

6./Release_1_18_1_0_06_26/metadata/3_Control_Files/   
5./Release_1_18_1_0_06_26/metadata/7_SAS_Code/   
5./Release_1_18_1_0_06_26/others/1_content/   
1./.cache/pip/selfcheck/   
2./Release_1_18_1_0_06_26/metadata/5_Status/   
1./Release_1_18_1_0_06_26/compute/2_packages/   
1./sasuser.v94/   
4./metadata/FR1_Release_1.17.1.1/3_Control_Files/   
4./Release_1_18_1_0_06_26/metadata/6_Patches/   

This command counts the number of files in each directory under the current path. However, I do not understand {$NF="";dir[$0]++} END {for (i in dir) print dir[i]i}, especially dir[$0]. Can anyone explain it?

Upvotes: 1

Views: 460

Answers (2)

RavinderSingh13

Reputation: 133700

Please go through the following detailed explanation of the OP's code.

find ./ -type f |     ##Run find to list every regular file under the current directory and pipe the paths to awk.
awk -F / -v OFS=/ '   ##Run awk with / as both the input field separator and the output field separator.
{
  $NF=""              ##Empty the last field (the file name); awk rebuilds $0 with OFS, leaving the directory path.
  dir[$0]++           ##Use the rebuilt line (the directory path) as an index into array dir and add one to its count.
}
END{                  ##The END block runs after all input has been read.
  for(i in dir){      ##Loop over every index (directory path) stored in dir.
    print dir[i]i     ##Print the count followed directly by the directory path.
  }
}'

NOTE: The END block of an awk program runs once, after all of the input has been read.
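As a small illustration of that END behaviour, the sketch below counts input lines and prints the total only once, after the last line has been read. The three input lines are made up for the example.

```shell
# Feed three made-up lines to awk; the main block runs once per line,
# the END block runs a single time after the input is exhausted.
total=$(printf '%s\n' first second third | awk '{ n++ } END { print n }')
printf '%s\n' "$total"   # prints 3
```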

Upvotes: 3

TheAmigo

Reputation: 1072

The -F / tells awk to split the line by slashes. $0 is the whole line.

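$NF="" replaces the last field on the line (in this case the file name) with an empty string. Because a field was assigned, awk rebuilds $0 by joining the fields with OFS, which is why the command sets -v OFS=/. What remains is the directory path with a trailing slash.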
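To see what that assignment does to $0, here is a minimal sketch that runs the same step on a single path, once with OFS=/ and once without it. The path is a hypothetical sample, not taken from the question's output.

```shell
# Hypothetical sample path, standing in for one line of find output.
line='./Release/metadata/file.txt'

# With OFS=/ the rebuilt line is still a path, minus the file name.
with_ofs=$(printf '%s\n' "$line" | awk -F / -v OFS=/ '{ $NF = ""; print $0 }')

# Without OFS the fields are rejoined with the default single space.
without_ofs=$(printf '%s\n' "$line" | awk -F / '{ $NF = ""; print $0 }')

printf '[%s]\n' "$with_ofs"      # prints [./Release/metadata/]
printf '[%s]\n' "$without_ofs"   # prints [. Release metadata ]
```

The brackets are only there to make the trailing space in the second result visible.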

Then dir[$0]++ takes the whole line (after the file name has been removed) and uses it as the key into a hash (an awk associative array), incrementing that key's value by one. This effectively counts the number of files that share the same directory path.
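The counting step can be tried without touching the file system by feeding awk a fixed list of paths instead of find output. This is only a sketch: the four paths are invented, and the sort at the end is added here just to make the result reproducible.

```shell
# Same awk program as in the question, fed four invented paths.
counts=$(printf '%s\n' ./a/one.txt ./a/two.txt ./b/three.txt ./top.txt |
  awk -F / -v OFS=/ '{ $NF = ""; dir[$0]++ } END { for (i in dir) print dir[i]i }' |
  LC_ALL=C sort)

printf '%s\n' "$counts"
# prints:
# 1./
# 1./b/
# 2./a/
```

Both files under ./a/ collapse to the same key "./a/", so its counter reaches 2, while "./b/" and "./" are each seen once.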

The END block loops through all keys in the dir[] hash printing first the count, then the directory name.
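Note that awk does not guarantee any particular order for a for (i in dir) loop, which is why the output in the question is not sorted. If ordered output is wanted, one possible variation (not the OP's command) is to print a space between the count and the path and pipe the result through sort. The function name count_by_dir and the sample paths are made up for this sketch.

```shell
# Hypothetical helper: read paths on stdin, print "count directory",
# highest count first.
count_by_dir() {
  awk -F / -v OFS=/ '{ $NF = ""; dir[$0]++ }
    END { for (i in dir) print dir[i] " " i }' | sort -rn
}

sorted=$(printf '%s\n' ./a/1 ./a/2 ./a/3 ./b/1 ./b/2 ./c/1 | count_by_dir)
printf '%s\n' "$sorted"
# prints:
# 3 ./a/
# 2 ./b/
# 1 ./c/
```

On a real tree this would be used as: find ./ -type f | count_by_dir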

Upvotes: 3
