Segmentation fault in multithreaded matrix multiplication in C

I am trying to multiply two matrices using multithreading. I compile the program with gcc on Linux and run it with the number of threads as an argument:

gcc multiThread.c -o test -lpthread
./test 4

The program multiplies N*N matrices for N from 10 to 1000 in steps of 10 and records the execution time for each iteration. When I run it, I get a segmentation fault. Please help.

#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>
#include<time.h>

int SIZE = 10;   // Size by SIZE matrices
int num_thrd;   // number of threads

int A[2000][2000], B[2000][2000], C[2000][2000];

// initialize a matrix
void init_matrix(int m[SIZE][SIZE])
{
  int i, j;
  for (i = 0; i < SIZE; i++)
    for (j = 0; j < SIZE; j++)
      m[i][j] = rand() % 100 + 1;
}

// thread function: taking "slice" as its argument
void* multiply(void* slice)
{
  int s = (int)slice;   // retrieve the slice info
  int from = (s * SIZE)/num_thrd; // note that this 'slicing' works fine
  int to = ((s+1) * SIZE)/num_thrd; // even if SIZE is not divisible by num_thrd
  int i,j,k;

  printf("computing slice %d (from row %d to %d)\n", s, from, to-1);
  for (i = from; i < to; i++)
  {  
    for (j = 0; j < SIZE; j++)
    {
      C[i][j] = 0;
      for ( k = 0; k < SIZE; k++)
        C[i][j] += A[i][k]*B[k][j];
    }
  }
  printf("finished slice %d\n", s);
}

int main(int argc, char* argv[])
{
  FILE *outFile;
  outFile = fopen("Algorithm3_Times.txt", "r");
  pthread_t* thread;  // pointer to a group of threads
for(int ini=0; ini<100; ini++)
{


  int i;

  if (argc!=2)
  {
    printf("Usage: %s number_of_threads\n",argv[0]);
    exit(-1);
  }

  num_thrd = atoi(argv[1]);
  init_matrix(A);
  init_matrix(B);
  clock_t start = clock();
  thread = (pthread_t*) malloc(num_thrd*sizeof(pthread_t));
  // this for loop is not entered if the thread number is specified as 1
  for (i = 1; i < num_thrd; i++)
  {
    // creates each thread working on its own slice of i
    if (pthread_create (&thread[i], NULL, multiply, (void*)i) != 0 )
    {
      perror("Can't create thread");
      free(thread);
      exit(-1);
    }
  }

  // main thread works on slice 0
  // so everybody is busy
  // main thread does everything if the thread number is specified as 1
  multiply(0);

  // main thread waits for the other threads to complete
  for (i = 1; i < num_thrd; i++)
    pthread_join (thread[i], NULL);

  clock_t end = clock();

  float time = (end - start)*1000 / CLOCKS_PER_SEC;
  fprintf(outFile,"time taken for Multiplication using %d", num_thrd);
  fprintf(outFile," threads =  %f", time);
  fprintf(outFile," milliseconds \n");
  if (thread != NULL)
  {
      free(thread);
      thread = NULL;
  }
  SIZE += 10;

 }

  printf("calculation completed.\n\n");
  return 0;

}

Upvotes: 0

Views: 759

Answers (1)

erik258

Reputation: 16275

C is a hard language, especially because runtime errors by default include no useful debugging information. This is where a debugger comes in.

How I debugged this with docker/alpine:

  1. Put the code in ~/gcc/t.c so it can be shared with my Docker container
  2. docker run --rm -it -v ~/gcc:/code alpine
  3. apk add build-base gdb musl-dbg to install gcc & friends, gdb, and musl-dbg, which I'll need to debug the segfault inside the standard library
  4. cd /code
  5. gcc -g t.c -o t
  6. gdb t

Now here's my debugger session:

GNU gdb (GDB) 8.0.1

[ ... preamble removed for brevity ... ]

Reading symbols from t...done.
(gdb) run 1
Starting program: /code/t 1
warning: Error disabling address space randomization: Operation not permitted
computing slice 0 (from row 0 to 9)
finished slice 0

Program received signal SIGSEGV, Segmentation fault.
vfprintf (f=0x0, fmt=0x560896553020 "time taken for Multiplication using %d",
    ap=ap@entry=0x7ffcb18b29f8) at src/stdio/vfprintf.c:671
671 src/stdio/vfprintf.c: No such file or directory.
(gdb) bt
#0  vfprintf (f=0x0,
    fmt=0x560896553020 "time taken for Multiplication using %d",
    ap=ap@entry=0x7ffcb18b29f8) at src/stdio/vfprintf.c:671
#1  0x00007fa6ef0c056f in fprintf (f=<optimized out>, fmt=<optimized out>)
    at src/stdio/fprintf.c:9
#2  0x0000560896552ee2 in main (argc=2, argv=0x7ffcb18b2b68) at t.c:87

Great, now we have a line number to go see. Line 87 is this line:

fprintf(outFile,"time taken for Multiplication using %d", num_thrd);

We can take a look at the line, do a little thinking, and eventually come to look at the definition of outFile:

outFile = fopen("Algorithm3_Times.txt", "r");

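Aha! The backtrace shows f=0x0, so outFile is a null pointer: fopen returned NULL. We expect to write to outFile, but we opened it for reading, and opening in "r" mode fails when the file does not exist yet. I changed the mode to open it for writing: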

  outFile = fopen("Algorithm3_Times.txt", "w");
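Since the crash came from using an unchecked fopen result, it may also be worth checking the return value so a failure is reported instead of crashing later. A minimal sketch, assuming a helper named open_checked (a hypothetical name, not part of the original program):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not in the original program: open a file and
 * report the reason on failure, so that a later fprintf never receives
 * a null FILE pointer without the caller noticing. */
FILE *open_checked(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);

    FILE *f = fopen(path, mode);
    if (f == NULL)
        perror(path);   /* prints the path followed by the system error text */
    return f;           /* caller must still test for NULL before using it */
}
```

In main this could be used as outFile = open_checked("Algorithm3_Times.txt", "w"); followed by an early return if outFile is NULL, and an fclose(outFile) before the program exits.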

And your program runs (below I show only the first 10 lines of output):

/code # ./t 10  |head
computing slice 1 (from row 1 to 1)
finished slice 1
computing slice 2 (from row 2 to 2)
computing slice 3 (from row 3 to 3)
finished slice 3
finished slice 2
computing slice 4 (from row 4 to 4)
finished slice 4
computing slice 5 (from row 5 to 5)
finished slice 5

Now that you've been introduced to gdb, you can set breakpoints at particular lines and work through any remaining bugs. My run never seemed to finish, but I didn't give it very much time.
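One remaining issue that may be worth looking at: init_matrix is declared as taking int m[SIZE][SIZE], but A and B are int[2000][2000]. In practice the function then steps between rows SIZE ints at a time, while multiply indexes the same arrays with rows 2000 ints apart, so the two do not agree on where element [i][j] lives. A sketch of one way to keep them consistent, assuming a constant MAX_N and a function init_matrix_n (both hypothetical names introduced here for illustration):

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_N 2000   /* assumption: matches the 2000x2000 globals in the question */

/* Hypothetical replacement for init_matrix: the parameter type carries
 * the arrays' real row length (MAX_N), and n is the size of the part of
 * the matrix actually in use (SIZE in the question). */
void init_matrix_n(int m[][MAX_N], int n)
{
    assert(n >= 0 && n <= MAX_N);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            m[i][j] = rand() % 100 + 1;   /* values in the range 1..100 */
}
```

The calls in main would then become init_matrix_n(A, SIZE); and init_matrix_n(B, SIZE);. This is only one option; allocating exactly SIZE*SIZE elements per iteration would be another.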

Upvotes: 1
