Reputation: 30842
We're in a period of development where a lot of code is being created that may be short-lived: it's effectively scaffolding that at some point gets replaced with something else, but it will often continue to exist and be forgotten about.
Are there any good techniques for finding the classes in a codebase that aren't used? Obviously there will be many false positives (eg library classes: you might not be using all the standard containers, but you want to know they're there), but if they were listed by directory then it may make it easier to see at a glance.
I could write a script that greps for all class XXX declarations and then searches again for all instances, but it would have to omit results from the cpp file in which the class's methods are defined. This would also be incredibly slow: O(N^2) in the number of classes in the codebase.
Code coverage tools aren't really an option here, as this has a GUI that can't have all functions easily invoked programmatically.
Platforms are Visual Studio 2013 or Xcode/clang
EDIT: I don't believe this to be a duplicate of the dead code question. Although there is an overlap, identifying dead or unreachable code isn't quite the same as finding unreferenced classes.
Upvotes: 4
Views: 768
Reputation: 21721
If you're on Linux, then you can use g++ to help you with this.
I'm going to assume that only when an instance of the class is created will we consider it as being used. Therefore, rather than looking just for the name of the class you could look for calls to the constructors.
struct A
{
    A () { }
};
struct B
{
    B () { }
};
struct C
{
    C () { }
};
void bar ()
{
    C c;
}
int main ()
{
    B b;
}
On Linux at least, running nm on the binary shows the following mangled names:
00000000004005bc T _Z3barv
00000000004005ee W _ZN1BC1Ev
00000000004005ee W _ZN1BC2Ev
00000000004005f8 W _ZN1CC1Ev
00000000004005f8 W _ZN1CC2Ev
Immediately we can tell that none of the constructors for 'A' are called.
Using slightly modified information from this SO answer, we can also get g++ to remove function call graphs that are not used:
Which results in:
00000000004005ba W _ZN1BC1Ev
00000000004005ba W _ZN1BC2Ev
So, on linux at least, you can tell that neither A nor C is required in the final executable.
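To turn an nm listing into a list of class names automatically, something along these lines might work. It is only a sketch: it assumes GNU nm run with -C (demangle), so that constructors show up as Class::Class(...), and the function name constructed_classes is made up for illustration. Template classes are not recognised, and constructors that the optimiser has inlined away will not appear in the listing at all.

```bash
#!/bin/bash
# Hypothetical helper: reads `nm -C` output on stdin and prints the name of
# every class that has at least one constructor symbol, one name per line.
constructed_classes() {
    awk '{
        sym = $3                       # third column is the demangled symbol
        sub(/\(.*/, "", sym)           # drop the parameter list
        n = split(sym, parts, "::")    # split the qualified name
        # A constructor has the same name as its enclosing class.
        if (n >= 2 && parts[n] == parts[n - 1]) print parts[n]
    }' | sort -u
}

# Example use (the binary name is an assumption):
#   nm -C ./a.out | constructed_classes
```

Any class that is declared in the source but missing from this list is a candidate for review.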
Upvotes: 1
Reputation: 30842
I've come up with a simple shell script that will at least help to focus attention on the classes that are referenced the least. I've made the assumption that if a class isn't used then its name will still appear in one or two files (declaration in the header and definition in the cpp file). So the script uses ctags to search for class declarations in a source directory. Then for each class it does a recursive grep to find all the files that mention the class (note: you can specify different class and usage directories), and finally it writes the file counts and class names to a file and displays them in numerical order. You can then review all the entries that only had 1 or 2 mentions.
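#!/bin/bash
# Count how many source files mention each class declared under <classdir>.
CLASSDIR=${1:-}
USAGEDIR=${2:-}
if [ -z "${CLASSDIR}" ] || [ -z "${USAGEDIR}" ]; then
    echo "Usage: find_unreferenced_classes.sh <classdir> <usagedir>"
    exit 1
fi

# List every class name that ctags finds under CLASSDIR, one per line.
ctags --recurse=yes --languages=c++ --c++-kinds=c -x "$CLASSDIR" | awk '{print $1}' | sort -u > tags

# Start from an empty counts file.
rm -f counts

while read -r class; do
    # -w matches whole words only, so "Foo" is not also counted for "FooBar".
    count=$(grep -l -r -w -e "$class" "$USAGEDIR" --include='*.h' --include='*.cpp' | wc -l)
    echo "$count $class" >> counts
done < tags

sort -n counts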
Sample output:
1 SomeUnusedClassDefinedInHeader
2 SomeUnusedClassDefinedAndDeclaredInHAndCppFile
10 SomeClassUsedLots
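The script above runs one recursive grep per class, which is the slow part on a large codebase. A possible single-pass variant is sketched below. It assumes GNU grep, and the function name count_files_per_class and its two arguments are made up for illustration. It hands grep the whole list of class names at once (-F -f), prints every whole-word match together with its file name (-o -w -H), and then counts the distinct files per class.

```bash
#!/bin/bash
# Hypothetical helper: count_files_per_class <tags-file> <usagedir>
# <tags-file> holds one class name per line (e.g. the "tags" file from ctags).
count_files_per_class() {
    tags_file=$1
    usage_dir=$2
    # One grep over the tree; each output line looks like "path/file.h:ClassName".
    grep -r -o -w -H -F -f "$tags_file" \
         --include='*.h' --include='*.cpp' "$usage_dir" |
        sort -u |                                   # one line per (file, class) pair
        awk -F: '{ files[$NF]++ }                   # last field is the class name
                 END { for (c in files) print files[c], c }' |
        sort -n
}

# Example use (paths are assumptions):
#   count_files_per_class tags src/
```

A class that is never mentioned under the usage directory does not appear in the output at all, so this only matches the original script's results when the class directory lies inside the usage directory.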
Upvotes: 0