Reputation: 145
I am using unsigned long to collect performance data while profiling an application, as follows:
unsigned long time_taken = (unsigned long) ((finish_time_in_nano_sec - start_time_in_nano_sec)/1000);
I write time_taken to a file, and I do this for every function that gets called in my app.
After the app finishes executing, a separate program reads the above file and sums up the time taken for every function, along with call counts, such as
func_name aggregate_time call_counts
The problem I have is that for some functions, the aggregate_time field is a 20-digit value, i.e. close to the maximum value a 64-bit unsigned long can hold. This cannot be right, because I measure time in microseconds and don't run my app for more than about 20 seconds. How can it then be a 20-digit value?
Do you see mistakes in Steps 1, 2 and 3?
Regards, Krishna
EDIT:
1) Time measurement:
clock_gettime(CLOCK_REALTIME, &start_time);
clock_gettime(CLOCK_REALTIME, &finish_time);
unsigned long time_taken = (unsigned long)((finish_time.tv_nsec - start_time.tv_nsec)/1000);
2) File write:
fwrite(&time_taken, sizeof(unsigned long), 1, datafile);
3) File read:
fread(&time_taken, sizeof(long), 1, datafile);
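For reference, here is a minimal sketch of how the three steps might fit together in one compilable unit. The helper names (profile_one_call, read_one_record), the function-pointer parameter and the surrounding structure are assumptions and not part of the original post; the subtraction itself is reproduced exactly as written in Step 1, which is what the answers below focus on.
#include <stdio.h>
#include <time.h>

/* Hypothetical wrapper around one measured call (Steps 1 and 2). */
static void profile_one_call(FILE *datafile, void (*measured_func)(void))
{
    struct timespec start_time, finish_time;

    clock_gettime(CLOCK_REALTIME, &start_time);   /* Step 1: start stamp */
    measured_func();
    clock_gettime(CLOCK_REALTIME, &finish_time);  /* Step 1: finish stamp */

    /* The subtraction as posted: only the tv_nsec fields are used. */
    unsigned long time_taken =
        (unsigned long)((finish_time.tv_nsec - start_time.tv_nsec) / 1000);

    /* Step 2: write the raw unsigned long to the data file. */
    fwrite(&time_taken, sizeof(unsigned long), 1, datafile);
}

/* Step 3: read one record back when aggregating. */
static size_t read_one_record(FILE *datafile, unsigned long *time_taken)
{
    return fread(time_taken, sizeof(unsigned long), 1, datafile);
}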
Upvotes: 3
Views: 1195
Reputation: 14900
Actually, it would be possible in some cases to obtain negative times. Considering that the time intervals are very small, they most probably come from using the RDTSC instruction, which reads a counter that increments with every clock cycle of the CPU. On some multicore AMD processors each core has its own counter, which is not synchronized with the others, so the difference between two consecutive readings performed on different cores can be negative.
This can be solved easily by forcing the program to execute on a single core, for example by using the Win32 API SetProcessAffinityMask.
EDIT: Be careful how you measure times. Most time APIs have horrible resolution, so measuring small intervals with them makes no sense. Try using RDTSC explicitly.
And most importantly: file I/O is so slow that it can completely destroy the useful information. For example, if you measure both the caller and the callee, and the callee writes its result to the file, then the time measured for the caller will include the time spent in fwrite. Monitor each write as well, so that you can see how much time is spent there (to avoid recursion, you can save the time taken by an fwrite and log it with the next fwrite).
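As a rough sketch of that suggestion, assuming a Windows build (MSVC or MinGW) and that pinning to core 0 is acceptable, the combination of SetProcessAffinityMask and RDTSC could look like this; the affinity mask value and the use of the __rdtsc() intrinsic are my choices, not spelled out in the answer.
#include <windows.h>
#include <intrin.h>   /* __rdtsc(); use <x86intrin.h> with GCC */
#include <stdio.h>

int main(void)
{
    /* Pin the process to core 0 so both TSC readings come from
       the same core's counter. */
    if (!SetProcessAffinityMask(GetCurrentProcess(), 1)) {
        fprintf(stderr, "SetProcessAffinityMask failed\n");
        return 1;
    }

    unsigned long long start = __rdtsc();
    /* ... code being measured ... */
    unsigned long long finish = __rdtsc();

    /* The difference is in CPU clock ticks; divide by the core
       frequency to convert to seconds if needed. */
    printf("elapsed ticks: %llu\n", finish - start);
    return 0;
}
On Linux, where the question's clock_gettime code runs, the equivalent of the pinning step would be sched_setaffinity, but that is outside what this answer describes.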
Upvotes: 0
Reputation: 239321
The problem is in your time measurement code:
clock_gettime(CLOCK_REALTIME, &start_time);
clock_gettime(CLOCK_REALTIME, &finish_time);
unsigned long time_taken = (unsigned long)((finish_time.tv_nsec - start_time.tv_nsec)/1000);
If a 1-second clock boundary is crossed between start_time and finish_time, then your subtraction can produce a negative value, which is then converted to a very large positive unsigned value.
You need to take the tv_sec member of the times into account as well:
unsigned long time_taken;
time_taken = (unsigned long)((finish_time.tv_sec - start_time.tv_sec) * 1000000);
time_taken += (unsigned long)(finish_time.tv_nsec / 1000);
time_taken -= (unsigned long)(start_time.tv_nsec / 1000);
(e.g. a clock time of 23989032.452 seconds would be represented as a struct timespec with .tv_sec = 23989032 and .tv_nsec = 452000000)
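Wrapped into a small helper, the same arithmetic might look like the sketch below; the function name elapsed_microseconds and the demo main are mine, while the computation is the one shown above.
#include <stdio.h>
#include <time.h>

/* Elapsed time in microseconds between two timespecs, using both
   the tv_sec and tv_nsec members. */
static unsigned long elapsed_microseconds(const struct timespec *start,
                                          const struct timespec *finish)
{
    unsigned long us;
    us  = (unsigned long)((finish->tv_sec - start->tv_sec) * 1000000);
    us += (unsigned long)(finish->tv_nsec / 1000);
    us -= (unsigned long)(start->tv_nsec / 1000);
    return us;
}

int main(void)
{
    struct timespec start_time, finish_time;

    clock_gettime(CLOCK_REALTIME, &start_time);
    /* ... work being measured ... */
    clock_gettime(CLOCK_REALTIME, &finish_time);

    printf("%lu us\n", elapsed_microseconds(&start_time, &finish_time));
    return 0;
}
On older glibc versions this may need -lrt at link time for clock_gettime.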
Upvotes: 5
Reputation: 754920
The problem isn't in the code you show, though it would be interesting to know the types of the times in nanoseconds.
Therefore, the problem must be in the code you don't show.
Upvotes: 1