Reputation: 1701
Here is the problem: there is a set of programs written in different languages (mostly Perl and Python). Every program x reads lines from stdin, does some work (parses the line, updates data structures; there are no long queries to a DB or fancy network communication, and even disk IO is rare), and maybe prints something to stdout. The task is to write a program f that, given x and stdin, samples the lines that were most computationally hard for x. The idea is to use such lines for testing and benchmarking x in the future.
Here is the thing I am stuck on: f wraps x, reads a single line l from stdin, waits until x is ready to process l, passes l to x, and immediately starts to collect statistics about x. The trouble is that I cannot find any metric that distinguishes computationally hard lines from easy ones. For now I've tried two approaches:
1. Compare /proc/[x pid]/stat between runs of x. It almost does not change (even the CPU ticks).
2. Watch the x state (using the same /proc/[x pid]/stat) and try to measure the time it was running. It does not differ between lines.

Maybe there are some high-precision metrics? Like the number of CPU instructions executed or the number of bytes of memory used?
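For reference, here is roughly how I read the tick counters in the first approach (field positions per proc(5); splitting after the last ')' guards against spaces in the command name):

def cpu_ticks(pid):
    # Total CPU time consumed by the process, in clock ticks (USER_HZ,
    # usually 100 per second -- so the resolution is only ~10 ms).
    with open('/proc/%d/stat' % pid) as f:
        data = f.read()
    # comm (field 2) may contain spaces, so parse after the last ')'.
    fields = data[data.rindex(')') + 2:].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    utime, stime = int(fields[11]), int(fields[12])
    return utime + stime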
Here is the actual code in Python that I've written; it is full of details, so it is probably the last thing to read: https://gist.github.com/alexanderkuk/5630079#file-f-py
Upvotes: 4
Views: 172
Reputation:
There are a lot of problems with your code. First, this:
def command_is_running(pid):
    with open('/proc/%d/stat' % pid) as stat:
        stats = stat.read()
    return ' R ' in stats

def wait_command_processes_line(pid):
    # stats = ...
    while command_is_running(pid):
        # stats = update_stats(stats, pid)
    return stats
is a busy loop. It will eat as much CPU as it can, reading .../stat repeatedly until the R goes away. It can't be a good idea to run an extra CPU-hogging process while you're trying to get an accurate timing of CPU usage.
I don't know of any way to put a process to sleep until another process's run state changes, so I can't offer an efficient replacement for the busy loop. But that doesn't matter because of the second problem: process state isn't as predictable as you want it to be.
You've made an assumption that the process will become runnable the instant you write some data into its pipe, and will stay runnable for the duration of the processing of that input. It would be very hard to guarantee that this is true. You've said "disk IO is rare" but you'd have to do better than that and completely eliminate it, including page faults. That's hard, and you probably haven't done it. So I think your problem is not that /proc/PID/stat contains the wrong information, but that you're reading it at the wrong times.
You might get around the disk IO problem by treating D state the same as R. But it still looks kludgy.
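A minimal sketch of that tweak, parsing the state field properly instead of substring-matching (the function name here is mine, not from your gist):

def command_is_busy(pid):
    # Treat both R (running) and D (uninterruptible sleep, typically
    # waiting on disk IO) as "still working on the current line".
    with open('/proc/%d/stat' % pid) as f:
        data = f.read()
    state = data[data.rindex(')') + 2:].split()[0]
    return state in ('R', 'D')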
Instead of looking at process runnability, you should find a better indicator that the child process has finished handling the most recent input line. You said it "maybe prints something to stdout". If you can arrange for it to always print something to stdout for every input line, then the parent process could wait for that output and sample the child's CPU usage when it appears.
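A minimal sketch of that parent loop, assuming the child flushes exactly one output line per input line (that's something you'd have to arrange, not a given), with the tick reading inlined for completeness:

import subprocess

def cpu_ticks(pid):
    # utime + stime from /proc/[pid]/stat, in clock ticks (USER_HZ).
    with open('/proc/%d/stat' % pid) as f:
        fields = f.read().rpartition(')')[2].split()
    return int(fields[11]) + int(fields[12])

def rank_lines(argv, lines):
    child = subprocess.Popen(argv, stdin=subprocess.PIPE,
                             stdout=subprocess.PIPE)
    timings = []
    before = cpu_ticks(child.pid)
    for line in lines:
        child.stdin.write(line)       # lines are bytes, '\n'-terminated
        child.stdin.flush()
        child.stdout.readline()       # block until the child answers
        after = cpu_ticks(child.pid)  # sample only after completion
        timings.append((after - before, line))
        before = after
    child.stdin.close()
    child.wait()
    # Hardest lines first.
    return sorted(timings, reverse=True)

Keep in mind that the tick counters are coarse (typically 10 ms per tick), so a line that is fast in absolute terms may show a delta of zero; feeding the same line many times and averaging is one way around that.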
If you can't get the child process to provide an external indication of completion for each input line, an alternative might be to consider it done with an input line when it tries to read the next input line. Basically you'd be using ptrace to implement a specialized strace-like utility, recording the times of the reads on the input pipe, writing a line into the pipe only after your tracing tells you that it's trying to read.
Maybe you can even do that with strace and some clever shell scripting.
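For example (a rough sketch, not a tested recipe: this flag set is one plausible choice, and the parsing assumes strace's default output format), you could timestamp every read on fd 0 and treat the gap between consecutive reads as the cost of the line in between:

import re
import subprocess

def input_read_times(argv, log='trace.log'):
    # -tt: microsecond wall-clock timestamps; -e trace=read: record only
    # read() syscalls; -o: keep the trace separate from the child's stderr.
    subprocess.call(['strace', '-tt', '-e', 'trace=read', '-o', log] + argv)
    times = []
    with open(log) as f:
        for line in f:
            # e.g. 12:34:56.789012 read(0, "..."..., 4096) = 57
            m = re.match(r'(\d\d:\d\d:\d\d\.\d+) read\(0,', line)
            if m:
                times.append(m.group(1))
    # The gap between consecutive timestamps approximates how long the
    # child spent on whatever it read in between.
    return times

One caveat: stdio buffering in the child can make a single read() slurp many input lines at once, so you'd likely have to defeat buffering (or feed the pipe one line at a time, as above) for the reads to line up with lines.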
Another variation on that idea would be to use gdb to set a breakpoint in the child process at the start of its input processing loop, and set up a script to be run every time the breakpoint is hit. The script would collect timing information, write the next line to the pipe, and then do a gdb cont.
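Sketched in the same spirit (everything here is illustrative: process_line is a hypothetical symbol standing in for wherever the child's input loop starts, and the timestamps are wall-clock, not CPU):

import subprocess

# Hypothetical gdb command file: log a timestamp at every entry to the
# input loop, then let the child keep running.
GDB_SCRIPT = '''\
break process_line
commands
  silent
  shell date +%s.%N >> /tmp/line_times
  continue
end
run < input.txt
'''

def trace_with_gdb(binary):
    with open('trace.gdb', 'w') as f:
        f.write(GDB_SCRIPT)
    # -batch: exit when the script finishes; -x: execute the command file.
    subprocess.call(['gdb', '-batch', '-x', 'trace.gdb', binary])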
Upvotes: 1