Ivan Velho
Ivan Velho

Reputation: 13

gcc module size limits

I created a program in C that writes a program as its output.

The objective is to test aspects of performance in monolithic programs.

The first test was configured with 10 000 iterations and the result program compiled and ran. The second test with 100 000 iterations is compiling (3030 minutes until now) in Ubuntu 12.04 x86_64 in an i7 3770 with 16 GB RAM (16 GB SWAP).

I know that a parse complexity is from O(n**2) to O(n**3), but this is taking too long. In the worst scenario, 1000 times more compile timing.

Is consuming 35,2% of memory and still increases.

My question is:

The original program generator is:

#include <stdio.h>
#define MAX_INTERACTION 100000

int main(int argc, char **argv)
{
FILE * fp;
fp = fopen("source.c","w");

fprintf(fp,"#include <stdio.h> \n \n \n");
fprintf(fp,"int main(int argc, char **argv) \n");
fprintf(fp,"{ \n");

// local variables and exchange variables
for (int i=0; i< MAX_INTERACTION ; ++i)
{
    // passed variable, return label , local variable
    fprintf(fp," int pv%d , rl%d, loc%d ; \n",i,i,i); 
}

fprintf(fp," int pvd =0 ;\n \n \n");

//code blocks
for (int i=0; i< MAX_INTERACTION ; ++i)
{
    fprintf(fp," block%d : \n",i);
    fprintf(fp," loc%d = pv%d +1 ; \n",i,i);
    fprintf(fp," goto  rl%d; \n",i);
}

//call blocks
for (int i=1; i< MAX_INTERACTION +1; ++i)
{
    fprintf(fp," pvd = pv%d ;\n",(i-1));
    fprintf(fp," goto block%d; \n",(i-1));
    fprintf(fp," rl%d: \n",(i-1));
}

fprintf (fp,"printf( \"Concluido \\n \"); \n");
fprintf(fp,"}\n");

fclose(fp);
}

Upvotes: 1

Views: 1788

Answers (1)

Jonathan Leffler
Jonathan Leffler

Reputation: 754440

I did some timing on a MacBook Pro with 8 GB main memory (2.3 GHz Intel Core i7).

Modified the generator to take a parameter indicating the program size, and then ran it repeatedly:

$ for size in 10 100 1000 2000 3000 4000 5000 10000 20000 30000
> do
>     ./generator $size
>     echo $size
>     time make -s source 2>/dev/null
>     sleep 1
> done
10

real    0m0.526s
user    0m0.030s
sys     0m0.029s
100

real    0m0.084s
user    0m0.031s
sys     0m0.018s
1000

real    0m0.333s
user    0m0.235s
sys     0m0.044s
2000

real    0m0.392s
user    0m0.318s
sys     0m0.046s
3000

real    0m0.786s
user    0m0.661s
sys     0m0.070s
4000

real    0m0.657s
user    0m0.599s
sys     0m0.053s
5000

real    0m0.978s
user    0m0.893s
sys     0m0.069s
10000

real    0m3.063s
user    0m2.770s
sys     0m0.149s
20000

real    0m8.109s
user    0m7.315s
sys     0m0.274s
30000

real    0m21.410s
user    0m19.553s
sys     0m0.483s

$

Clearly, at the small sizes, there is overhead in simply getting the compiler run (especially the first run!), reading and writing files, etc. Redoing in multiples of 10,000 up to 90,000, I got the results in the table below. The slowdown is appreciable, especially at between 20,000 and 30,000. I also got fairly significant variability in the times for any given size. And making the code configurable and then running incrementally when you run into a problem is only sensible.

                    Compared to 10K
Size       Time     Size Ratio   Time Ratio  Log Time Ratio
 10K        3.7      1.00          1.00       0.000
 20K        8.1      2.00          2.19       0.784
 30K       25.1      3.00          6.78       1.915
 40K       45.2      4.00         12.22       2.503
 50K       76.7      5.00         20.73       3.032
 60K      110.5      6.00         29.96       3.397
 70K      176.0      7.00         47.57       3.862
 80K      212.0      8.00         57.30       4.048
 90K      292.3      9.00         79.00       4.369
100K      363.3     10.00         98.19       4.587

Your mileage will vary, of course.

For reference, the GCC I'm using is:

i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2336.9.00)

The compilation command line is:

/usr/bin/gcc -O3 -g -std=c99 -Wall -Wextra -Wmissing-prototypes -Wstrict-prototypes \
       -Wold-style-definition source.c -o source 

A home-built GCC 4.7.1 took 34.6 s for 10K, and 150.9 s for 20K.

Upvotes: 1

Related Questions