Carst

Reputation: 1614

Python dependencies inside a package

I know there are various discussions around this subject already, but I have a specific, slightly different question. Most existing questions I have found focus on external (inter-)dependencies on other packages, while my interest is mostly in the dependencies inside my own package.

I have found a variety of tools that help to find and visualize interdependencies.

The problem I have with using them is that they show all the dependencies of all the modules, while I would really like to focus on my own internal dependencies plus the "first" external dependency per module. For example, I use pandas and scipy in many places, so I would like to see those referenced, but not the internal structure of those packages or their dependencies on other things. Following those produces an explosion of dependencies that are outside my control and therefore not of direct interest to me.

Pycallgraph does work, but it gives gigantic results that obscure the small part of the total dependencies that I'm interested in. Does anyone have any pointers? Do I need to build something simpler myself, or am I overlooking something?

Thank you for your help!

Edit:

So pycallgraph is not really handy for me, as it works by actually executing the code. The problem with modulegraph is that (as said in the comment too) it creates a huge dot file (9000 lines). However (argh), it does not give dependencies between modules on the same package level. So if you have a package "main" with modules "a", "b", "c" and a subpackage "main.file_import" with modules "x", "y", "z", it only gives a dependency between "main" and "main.file_import". That is not what I'm looking for, as I'm trying to figure out whether the actual structure should be refactored (on module level and on function/class level). I'll keep adding things here when I find or create a good solution for this. I had thought this would be a common issue, though.

Upvotes: 3

Views: 2218

Answers (2)

JL Peyret

Reputation: 12204

With respect to pycallgraph, I ended up with something somewhat useful, starting from basically the same point as you.

  1. Hack pycallgraph to save the intermediate dot file somewhere you can see it.

  2. Run egrep -v to trim out the stuff you don't care about in the dot file. This is where you strip out all logging calls, for example.

  3. Run gvpr, a DOT-manipulation utility that comes with graphviz, to select the node you are interested in.

Basic proof of concept code is at https://gist.github.com/jpeyret/33739f6cd99f6108ad5046bd47df5a16
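
For illustration only, step 2 can also be done in Python instead of egrep. This is a rough sketch, not the code from the gist: filter_dot_lines and the sample DOT lines are made up, and it assumes the dot file has one node or edge statement per line, which is what makes a line-based filter safe.

```python
import re


def filter_dot_lines(lines, unwanted_patterns):
    """Drop every line that matches any unwanted regex (like egrep -v)."""
    if not unwanted_patterns:
        # An empty pattern would match everything, so keep all lines.
        return list(lines)
    unwanted = re.compile("|".join(unwanted_patterns))
    return [line for line in lines if not unwanted.search(line)]


# Hypothetical DOT content, one statement per line.
dot_lines = [
    'digraph G {',
    '"main.a" -> "main.b";',
    '"main.a" -> "logging.info";',
    '"main.b" -> "pandas.core.frame.DataFrame";',
    '}',
]

# Strip logging calls and pandas internals, keep everything else.
filtered = filter_dot_lines(dot_lines, [r"logging\.", r"pandas\.core"])
print("\n".join(filtered))
```

The filtered lines can be written back to a file and passed on to gvpr or dot as in step 3.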

Upvotes: 1

borntyping

Reputation: 3112

Snakefood can restrict the dependencies that it will draw: http://furius.ca/snakefood/doc/snakefood-doc.html#restricting-dependencies

You might also be able to use clustering to group all dependencies in the same package (e.g. only show pandas once): http://furius.ca/snakefood/doc/snakefood-doc.html#filtering-and-clustering-dependencies

Snakefood is also a good option if you plan on filtering the output, as it can output data for each stage of its processing.
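
If you end up building something yourself, the same idea (keep your own module names in full, collapse every external import to its top-level package) can be sketched with the standard library's ast module. This is not how snakefood does it, just a minimal sketch: direct_imports and its parameters are made-up names, it reads imports statically without executing anything, and it assumes module_name is a regular module rather than a package's __init__.

```python
import ast


def direct_imports(source, module_name, own_package):
    """Return the set of modules that `source` imports directly.

    Imports inside `own_package` keep their full dotted name; anything
    else is collapsed to its top-level package (pandas.core -> pandas).
    """
    # Package containing the module, needed to resolve relative imports.
    parent_parts = module_name.split(".")[:-1]
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            if node.level:
                # Relative import: each extra dot climbs one package up.
                keep = max(0, len(parent_parts) - (node.level - 1))
                base = parent_parts[:keep]
                if node.module:
                    names = [".".join(base + [node.module])]
                else:
                    # "from . import y": each alias is treated as a sibling module.
                    names = [".".join(base + [a.name]) for a in node.names]
            else:
                names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            found.add(name if top == own_package else top)
    return found


# Hypothetical source of the module main.file_import.x
source = """
import pandas as pd
from scipy.stats import norm
from . import y
from ..a import helper
import main.b
"""

deps = direct_imports(source, "main.file_import.x", "main")
print(sorted(deps))
```

Running this over every .py file in the package gives one edge list per module, which is small enough to turn into a dot file by hand.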

Upvotes: 1
