Reputation: 85
I am currently in my second year of college, so my programming skills and knowledge are not as strong as I'd like them to be. I am doing an internship at a web development company during my summer break, and I am completely stumped by the first task that was assigned to me. That's why I'm here asking for some assistance.
In a main folder there are many sub-folders, and within each sub-folder there are many .js, .cs, and .php files - about 1000 files in total, of which roughly 300 are not being used. I need to go through each of the sub-folders and check whether each of these files is used/called by any other file. If it is not, I need to store the location of the unused file in a text file.
I did some research and found that the command grep -r filename *
does just that, but on the command line I cannot figure out how to loop through the folders and change the filename based on the contents of each folder. My workstation runs Windows with Cygwin installed.
Upvotes: 2
Views: 178
Reputation: 107080
Doesn't this require a double loop (i.e. O(N²))? You have to search every file for a reference to every other file.
I'd use Perl instead of Awk or BASH (although it is possible to do in BASH).
#! /usr/bin/env perl
use warnings;
use strict;
use feature qw(say);
use File::Find;      # Not crazy about File::Find, but it's a standard module
use File::Basename;

my %fileHash;
my @dirs = qw(foo bar barfu fufu barbar);   # List of the directories you're searching

# Find the names of all the files. Include ALL files and not just .php, etc.
find(\&wanted, @dirs);

sub wanted {
    return if (-d $File::Find::name);   # Skip directories
    $fileHash{$File::Find::name} = 0;   # Number of times the file is referenced
}

# Outer loop: foreach file you have to parse
foreach my $fileName (keys %fileHash) {

    # We don't have to grep through anything except the types below.
    (my $suffix = $fileName) =~ s/.*\.//;
    next unless ($suffix eq "js" or $suffix eq "cs" or $suffix eq "php");

    # Slurp the file into an array. That way, we can use the grep command.
    open my $fh, '<', $fileName or die qq(Can't open "$fileName" for reading: $!\n);
    my @lines = <$fh>;
    close $fh;

    # Now, look for each and every file you've got in that directory tree
    # in this particular file. This is the inner loop.
    foreach my $fileToFind (keys %fileHash) {
        my $basename = basename($fileToFind);

        # If any line in the file contains the file name, increment the hash.
        if (grep /\Q$basename\E/, @lines) {
            $fileHash{$fileToFind} += 1;
        }
    }
}

# Now just print out those files that never got incremented (i.e. never referenced)
foreach my $fileName (keys %fileHash) {
    next if ($fileHash{$fileName} != 0);
    say "File: $fileName";
}
I'm taking the shortcut of looking just for the file's basename and not the full name. In theory, I should be looking for both its full name from the root and its name relative to the referencing file. However, I'm too lazy to do that right now. Most likely, you don't have to worry about that.
Upvotes: 1
Reputation: 28864
echo file,count > results.csv
for f in $(find . -name '*.js' -o -name '*.cs' -o -name '*.php')
do
    echo "$f,$(grep -r "$(basename "$f")" * | wc -l)" >> results.csv
done
This will give you a CSV file like the one below, with the number of times each file is referenced:
file,count
file1,3
file2,1
file3,0
edited to remove file path before grepping
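Since the original question also asks for the locations of the unused files to end up in a text file, one extra step on top of the results.csv produced above (just a sketch; unused.txt is an example output name) could be:
awk -F, 'NR > 1 && $2 == 0 {print $1}' results.csv > unused.txt
This skips the header row and prints the first column (the file path) whenever the reference count is 0.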
Upvotes: 1
Reputation: 892
This is only a draft; you'll need to research these commands and work out your own logic...
for file in $(find -type f -name \*.extension); do
    grep -Rl "$file" /in/path
done > /tmp/myfiles
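Building on that draft, a rough way to end up with only the files that are never referenced (still a sketch; /in/path, the extensions, and /tmp/unused are placeholders) is to negate a quiet grep for each file's basename:
for file in $(find /in/path -type f \( -name '*.js' -o -name '*.cs' -o -name '*.php' \)); do
    # -qF: we only care about the exit status, i.e. whether anything mentions this name as a fixed string
    if ! grep -RqF "$(basename "$file")" /in/path; then
        echo "$file"
    fi
done > /tmp/unused
Note that this also matches a file that happens to mention its own name, so it errs on the side of reporting files as used.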
Upvotes: 0
Reputation: 1892
Phew, tricky. At least it is if you have to take the 'being used' bit into consideration.
In the case of .cs files, you can have import statements that won't easily let you conclude whether a file is in use. The import might work at the package level, unless I'm mistaken (I'm more of a Java guy...).
And I assume it gets worse for JavaScript and PHP files.
Maybe you should ask why that report is valuable in the first place?
Upvotes: 0