Reputation: 69
I'm trying to run a simple program using OpenMP. The program is as follows:
#include <iostream>
#include <fstream>
#include <vector>
#include <omp.h>
#include <algorithm>
#include <math.h>
#include <map>
#include <string>
#include <ctime>

using namespace std;

#define NUM 10

void openMP()
{
    omp_set_num_threads(1);
    int sum = 0;
    #pragma omp parallel for shared(sum)
    {
        for (int i = 0; i < 100; i++)
        {
            sum++;
        }
    }
    cout << "sum = " << sum << endl;
}

int main()
{
    cout << "Open MP \n";
    openMP();
    return 0;
}
Now when I compile it using
g++ test.cpp -fopenmp -o test
and run it in the Ubuntu terminal with
./test
the output is correct (I think), as follows:
Open MP
sum = 100
But when I try to run it using Multi2Sim with these two configuration files my instructor gave me:
multicore-config:
[ General ]
Cores = 4
Threads = 1
multicore-mem-config:
[CacheGeometry geo-l1]
Sets = 256
Assoc = 2
BlockSize = 64
Latency = 2
Policy = LRU
Ports = 2

[CacheGeometry geo-l2]
Sets = 512
Assoc = 4
BlockSize = 64
Latency = 20
Policy = LRU
Ports = 4

[Module mod-l1-0]
Type = Cache
Geometry = geo-l1
LowNetwork = net-l1-l2
LowModules = mod-l2

[Module mod-l1-1]
Type = Cache
Geometry = geo-l1
LowNetwork = net-l1-l2
LowModules = mod-l2

[Module mod-l2]
Type = Cache
Geometry = geo-l2
HighNetwork = net-l1-l2
LowNetwork = net-l2-mm
LowModules = mod-mm

[Module mod-mm]
Type = MainMemory
BlockSize = 256
Latency = 200
HighNetwork = net-l2-mm

[Network net-l2-mm]
DefaultInputBufferSize = 1024
DefaultOutputBufferSize = 1024
DefaultBandwidth = 256

[Network net-l1-l2]
DefaultInputBufferSize = 1024
DefaultOutputBufferSize = 1024
DefaultBandwidth = 256

[Entry core-0]
Arch = x86
Core = 0
Thread = 0
DataModule = mod-l1-0
InstModule = mod-l1-0

[Entry core-1]
Arch = x86
Core = 1
Thread = 0
DataModule = mod-l1-0
InstModule = mod-l1-0

[Entry core-2]
Arch = x86
Core = 2
Thread = 0
DataModule = mod-l1-0
InstModule = mod-l1-0

[Entry core-3]
Arch = x86
Core = 3
Thread = 0
DataModule = mod-l1-0
InstModule = mod-l1-0
And then, running this command in the Ubuntu terminal:
m2s --x86-config multicore-config.txt --mem-config multicore-mem-config.txt --x86-sim detailed test
I get the following output:
; Multi2Sim 4.0.1 - A Simulation Framework for CPU-GPU Heterogeneous Computing
; Please use command 'm2s --help' for a list of command-line options.
; Last compilation: May 8 2013 10:01:31
Open MP
sum = 83
;
; Simulation Statistics Summary
;
[ General ]
Time = 53.17
SimEnd = ContextsFinished
Cycles = 3691870
[ x86 ]
SimType = Detailed
Time = 53.15
Contexts = 4
Memory = 37056512
EmulatedInstructions = 3292450
EmulatedInstructionsPerSecond = 61943
Cycles = 3691558
CyclesPerSecond = 69452
FastForwardInstructions = 0
CommittedInstructions = 2081157
CommittedInstructionsPerCycle = 0.5638
CommittedMicroInstructions = 3113721
CommittedMicroInstructionsPerCycle = 0.8435
BranchPredictionAccuracy = 0.9375
Why is the output 83 in Multi2Sim, while the output of a normal run is 100?
Also, why does it take so much time to run on Multi2Sim?
Any help would be appreciated.
Upvotes: 0
Views: 637
Reputation: 8042
I don't really know m2s, but it could be the case that the culprit is:
#pragma omp parallel for shared(sum)
{
    for (int i = 0; i < 100; i++)
    {
        sum++; // Concurrent access to a shared variable!!!
    }
}
In your first test, the fact that you explicitly set the number of threads to 1 with
omp_set_num_threads(1);
saves you from the race condition. I would suggest trying:
#pragma omp parallel for shared(sum) reduction(+:sum)
for (int i = 0; i < 100; i++) {
    sum++;
}
to see if you can obtain the desired behavior.
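For what it's worth, here is a minimal self-contained sketch of the corrected program, using the reduction clause on its own (which is sufficient) and with the loop placed directly after the pragma, since #pragma omp parallel for must be immediately followed by the for loop. The thread count of 4 is just an assumption to match the four simulated cores:

#include <iostream>
#include <omp.h>

using namespace std;

void openMP()
{
    // 4 threads is an arbitrary choice here, matching the four cores
    // configured in the Multi2Sim setup above.
    omp_set_num_threads(4);

    int sum = 0;

    // reduction(+:sum) gives each thread a private copy of sum and adds
    // the copies together at the end of the loop, so no data race occurs.
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < 100; i++)
    {
        sum++;
    }

    cout << "sum = " << sum << endl; // should always print 100
}

int main()
{
    cout << "Open MP \n";
    openMP();
    return 0;
}

Compiled natively with g++ test.cpp -fopenmp -o test, this should print sum = 100 for any number of threads.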
Upvotes: 1