gpuguy

Reputation: 4585

Code is slower on Linux as compared to Windows

After modifying my C code (originally written for Windows and compiled with VS 2008), I ran it on Linux. To my surprise, it is now at least 10 times slower than the Windows version.

Using profiling tools, I found that the following function accounts for most of the time spent in the application:

/* advance by n bits */

void Flush_Buffer(N)
int N;
{
  int Incnt;

  ld->Bfr <<= N;

  Incnt = ld->Incnt -= N;

  if (Incnt <= 24)
  {
    if (System_Stream_Flag && (ld->Rdptr >= ld->Rdmax-4))
    {
      do
      {
        if (ld->Rdptr >= ld->Rdmax)
          Next_Packet();
        ld->Bfr |= Get_Byte() << (24 - Incnt);
        Incnt += 8;
      }
      while (Incnt <= 24);
    }
    else if (ld->Rdptr < ld->Rdbfr+2044)
    {
      do
      {
        ld->Bfr |= *ld->Rdptr++ << (24 - Incnt);
        Incnt += 8;
      }
      while (Incnt <= 24);
    }
    else
    {
      do
      {
        if (ld->Rdptr >= ld->Rdbfr+2048)
          Fill_Buffer();
        ld->Bfr |= *ld->Rdptr++ << (24 - Incnt);
        Incnt += 8;
      }
      while (Incnt <= 24);
    }
    ld->Incnt = Incnt;
  }
}

This function took negligible time on Windows. On Linux it takes close to 14 seconds. What have I done wrong here?

There are no system calls here, so this section of code should be independent of the OS and should therefore run in the same time on both platforms.

(My guess: this function is called many times, so the profiler may be accumulating the time of all the calls. In that case, one of the issues might be that the function does not receive its input parameter as quickly as it does on Windows.)

What have I done wrong here? Any guesses?

Regards,

H

Upvotes: 4

Views: 497

Answers (2)

Kerrek SB

Reputation: 476940

This is more of a note than an answer, but it doesn't quite fit into a comment, so I hope you won't hold this against me.

The term "profiling" has several related, but different meanings. In an abstract context it means "measuring" your program, usually with respect to certain runtime data. However, it's not the same as simply "timing" your program. Timing is one form of profiling, but there are many others.

For example, suppose you're unsure whether some data structure should be a std::set (a tree) or a std::unordered_set (a hash table). There isn't a universal answer, since it depends on what you use it for and what data you're processing. It is entirely possible that you cannot know the right answer until you specify the actual, real-world data that you're going to use. In that case, "profile and decide" means that you make two versions of your program, run them both against your real data, and measure the runtime. The faster one is probably the one you want.

On the other hand, GCC has a tool that's called a "profiler", which serves a very different purpose. It's an execution path profiler, if you will, which tells you where (i.e. in which function) your program spends most of its time. If you have a complex algorithm with lots of subroutines, you may not know which ones are the important ones, and again this may actually depend on your real-world input. In that case, the profiler can help you determine which functions are called the most given your input data, and you can concentrate optimisation efforts on those functions. Now "profile before optimising" means that you need to determine the priorities before getting to work.

That said, for the comparison you have in mind, you must not use the GCC profiler. Rather, compile on both platforms with all optimisations enabled and under release conditions, and then measure the runtime on the same set of input data.
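As a rough sketch of what "measure the runtime" could look like in portable C, the snippet below times a single call with `clock()` from `<time.h>`. The names `workload`, `time_workload` and `print_timing` are made up for illustration; `workload` is only a stand-in for whatever your program really does with its input. One caveat worth keeping in mind: as far as I know, `clock()` in the Microsoft C runtime reports elapsed wall-clock time, while on Linux it reports processor time, so the two numbers are only roughly comparable when the program is CPU-bound and single-threaded.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the real work (e.g. decoding one input file). */
unsigned long workload(unsigned long iterations)
{
    unsigned long acc = 0, i;
    for (i = 0; i < iterations; i++)
        acc += i ^ (acc >> 3);
    return acc;
}

/* Run workload() once and return the time clock() reports, in seconds.
   The result of the work is stored in *result so the compiler cannot
   discard the call as dead code. */
double time_workload(unsigned long iterations, unsigned long *result)
{
    clock_t start = clock();
    *result = workload(iterations);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Print one measurement; call this from main() on both platforms. */
void print_timing(const char *label, unsigned long iterations)
{
    unsigned long result = 0;
    double seconds = time_workload(iterations, &result);
    printf("%s: %.3f s (result %lu)\n", label, seconds, result);
}
```

Build it as an optimised release build on both systems, call something like `print_timing("decode", 100000000UL)` from `main()` with the same input on each, and compare the printed numbers.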

Upvotes: 1

user811773

Reputation:

You can try to annotate all code paths in your code with counters. At the end of the program, each counter will contain information about how many times the code path has been executed. Comparing these numbers line-by-line between the Windows version and the Linux version may reveal that the program is following different code paths. Depending on the nature of the code paths, the differences may be able to explain why the Linux version is slower than the Windows version.

#include <stdio.h>

int count[100];  // one counter per annotated code path

// Call this function at the end of the program
void PrintCounts() {
    int i;
    for(i=0; i<100; i++) printf("%d\n", count[i]);
}

void Flush_Buffer(int N) {
  int Incnt;

  ld->Bfr <<= N;

  Incnt = ld->Incnt -= N;

  if (Incnt <= 24) {
    count[0]++;
    if (System_Stream_Flag && (ld->Rdptr >= ld->Rdmax-4)) {
      count[1]++;
      do {
         count[2]++;
         ...
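
Since the listing above is cut short, here is the same counter idea on a small self-contained toy function, so it can be compiled and tried on both systems. Everything in it (`classify`, `COUNT_PATH`, `path_count`, `print_path_counts`) is a made-up name for illustration and is not part of the original decoder; `classify` merely stands in for a function with several branches such as `Flush_Buffer`.

```c
#include <stdio.h>

#define NUM_COUNTERS 4

/* One counter per annotated code path (hypothetical names). */
unsigned long path_count[NUM_COUNTERS];

/* Bump the counter that belongs to code path i. */
#define COUNT_PATH(i) (path_count[(i)]++)

/* Toy function with three branches, standing in for Flush_Buffer(). */
int classify(int n)
{
    COUNT_PATH(0);              /* counted on every call */
    if (n < 0) {
        COUNT_PATH(1);          /* negative branch */
        return -1;
    }
    if (n == 0) {
        COUNT_PATH(2);          /* zero branch */
        return 0;
    }
    COUNT_PATH(3);              /* positive branch */
    return 1;
}

/* Call this at the end of the program and diff the output
   between the Windows run and the Linux run. */
void print_path_counts(void)
{
    int i;
    for (i = 0; i < NUM_COUNTERS; i++)
        printf("path %d: %lu\n", i, path_count[i]);
}
```

If the two platforms print different counts for the same input, the program is taking different branches, and that is where I would start looking.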

Upvotes: 0
