Torstein Norum Bugge
Torstein Norum Bugge

Reputation: 56

How to transpose a matrix in java (parallel/multithreaded)

We all know how useful matrix transposition is, and writing a general algorithm for sequential use is no problem. However, I am having some trouble doing the same for multithreaded purposes, and have only gotten this small case for 4 x 4 to work properly.

My approach is to assign equal parts of a double[][] structure, in this case 2 x 2, to each out of four threads. In this case, that means starting positions of 0,0 & 0,2 & 2,0 & 2,2. This is passed in with "kol" and "rad".

However, I cannot get this to work on larger matrices, so any help would be appreciated. The closest answer to this problem I've found is here: How to to parallelize the matrix transpose?

This is also my inspiration for splitting the double[][] structure into four parts. My (working) 4 x 4 code can be found below, so how can I modify it to work for four threads?

Cheers!

public double[][] transponerMatrise(double[][] matrise, int rad, int 
kol, int id)
{
  if((id != 2))
   {
      for (int i = rad; i < n/2 + rad; i++)
      {
        for (int j = kol+1; j < n/2 + kol; j++)
        {
          System.out.println("Traad " + id + " bytter " + i + "," + j + " med " + j + "," + i);
          System.out.println("Value is " + matrise[i][j] + ", " + matrise[j][i]);
            element = matrise[i][j];
            matrise[i][j] = matrise[j][i];
            matrise[j][i] = element;
        }
      }
    }
    else
    {
      for (int i = rad; i < n/2 + rad-1; i++)
      {
        for (int j = kol; j < n/2 + kol; j++)
        {
          System.out.println("Traad " + id + " bytter " + i + "," + j + " med " + j + "," + i);
          System.out.println("Value is " + matrise[i][j] + ", " + matrise[j][i]);
            element = matrise[i][j];
            matrise[i][j] = matrise[j][i];
            matrise[j][i] = element;
        }
      }
    }
    return matrise;
}

PS: I know the code works properly, because I have a checking method against a working sequential variant.

Upvotes: 2

Views: 1142

Answers (3)

IEE1394
IEE1394

Reputation: 1261

based on the aproach of user3666197 you could do something like that:

public class Matrix {

        private class Index {
            Index(final int row, final int col) {
                super();
                this.row = row;
                this.col = col;
            }

            int col;
            int row;
        }

        public Matrix(final int rows, final int cols) {
            this.rows = rows;
            this.cols = cols;
            data = new int[rows][cols];
        }

        public int get(final int row, final int col) {
            return get(getIndex(row, col));
        }

        public void set(final int row, final int col, final int value) {
            set(getIndex(row, col), value);
        }

        public void transpose() {
            transpositioned = !transpositioned;
        }

        private int get(final Index index) {
            return data[index.row][index.col];
        }

        private Index getIndex(final int row, final int col) {
            return transpositioned ? new Index(col, row) : new Index(row, col);
        }

        private void set(final Index index, final int value) {
            data[index.row][index.col] = value;
        }

        private final int cols;
        private final int[][] data;
        private final int rows;
        private boolean transpositioned;

    }

Upvotes: 0

IEE1394
IEE1394

Reputation: 1261

Maybe it could be an simple idea to use a thread to swap a row to col but therefore you need twice the memory for you matrix which could be a problem on huge matrices. also if you have a 6 Core CPU i think there is not much benefit from using 100 thread so my thread pool is very small. And as @user3666197 mentioned its a still a expensive solution - but paralell ;-)

import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MatrixTransposition {

    public static void main(final String[] args) throws InterruptedException {
        final MatrixTransposition transposition = new MatrixTransposition();
        final int[][] source = transposition.create(32);
        final int[][] transposed = transposition.solve(source);
        System.out.println("Compare source and transpositon = " + transposition.compare(source, transposed));
        final int[][] result = transposition.solve(transposed);
        System.out.println("Compare source and double transpositon = " + transposition.compare(source, result));

        transposition.print(source);
        transposition.print(transposed);
    }

    public boolean compare(final int[][] a, final int[][] b) {
        for (int r = 0; r < a.length; r++) {
            for (int c = 0; c < a[0].length; c++) {
                if (a[r][c] != b[r][c]) return false;
            }
        }
        return true;
    }

    public int[][] create(final int size) {
        final int[][] result = new int[size][size];
        for (int r = 0; r < size; r++) {
            for (int c = 0; c < size; c++) {
                result[r][c] = r * size + c;
            }
        }
        return result;
    }

    public void print(final int[][] input) {
        final int size = input.length;
        final int maxNr = size * size;
        final int digits = new String(maxNr + "").length();
        final String cellFormat = "%0" + digits + "d ";

        for (int r = 0; r < input.length; r++) {
            final int[] row = input[r];
            for (final int c : row) {
                System.out.print(String.format(cellFormat, c));
            }
            System.out.println("");
        }

        System.out.println("");
    }

    public int[][] solve(final int[][] input) throws InterruptedException {
        final int width = input.length;
        final int height = input[0].length;

        final int[][] result = new int[width][height];
        final CountDownLatch latch = new CountDownLatch(width);
        for (int r = 0; r < width; r++) {
            final int row = r;
            threadPool.execute(() -> {
                solvePart(result, input, row);
                latch.countDown();
            });
        }

        latch.await();
        return result;
    }

    private void solvePart(final int[][] result, final int[][] input, final int r) {
        System.out.println("Solve row " + String.format("%02d", r) + " in thread " + Thread.currentThread().getName());
        final int[] row = input[r];
        for (int c = 0; c < row.length; c++) {
            result[c][r] = row[c];
        }
    }
    private final ExecutorService threadPool = Executors.newFixedThreadPool(6);
}

Upvotes: 1

user3666197
user3666197

Reputation: 1

This is an a priori lost fight against O(N^2) costs-function

If I may turn your attention to a smarter approach, having an almost O(1) ( constant ) cost, the trick would help you start working in a way more promising direction.

May try to add one thin abstraction layer above the high-performance tuned ( cache-lines friendly "raw"-storage of the matrix elements. This abstracted layer will help to access the "raw"-storage ( using indexing-axes, index-striding and slicing tricks - the very same way as HPC FORTRAN libraries have inspired the well-known numpy and it's striding tricks) and this way the .T-method will do nothing expensive ( alike bulldozing N^2 memory locations, swapping there and back ) but simply swap axes-parameters in the abstract layer, responsible for the indirection mapper, taking a few tens of nanoseconds for whatever large matrise[N,N]; where N = 10, 100, 1000, 10000, 100000, 1000000, 10000000+ still a few [ns].

There is nothing faster in doing this or other, more complex, matrix operations and both the FORTRAN and numpy, performance polished, methods are a proof per-se of such observation.

Upvotes: 2

Related Questions