DivM
DivM

Reputation: 137

How to run a method in parallel using Julia?

I was reading Parallel Computing docs of Julia, and having never done any parallel coding, I was left wanting a gentler intro. So, I thought of a (probably) simple problem that I couldn't figure out how to code in parallel Julia paradigm.

Let's say I have a matrix/dataframe df from some experiment. Its N rows are variables, and M columns are samples. I have a method pwCorr(..) that calculates pairwise correlation of rows. If I wanted an NxN matrix of all the pairwise correlations, I'd probably run a for-loop that'd iterate for N*N/2 (upper or lower triangle of the matrix) and fill in the values; however, this seems like a perfect thing to parallelize since each of the pwCorr() calls are independent of others. (Am I correct in thinking this way about what can be parallelized, and what cannot?)

To do this, I feel like I'd have to create a DArray that gets filled by a @parallel for loop. And if so, I'm not sure how this can be achieved in Julia. If that's not the right approach, I guess I don't even know where to begin.

Upvotes: 5

Views: 563

Answers (1)

JKnight
JKnight

Reputation: 2006

This should work, first you need to propagate the top level variable (data) to all the workers:

 for pid in workers()
       remotecall(pid, x->(global data; data=x; nothing), data)
       end

then perform the computation in chunks using the DArray constructor with some fancy indexing:

corrs = DArray((20,20)) do I
         out=zeros(length(I[1]),length(I[2]))
         for i=I[1], j=I[2]
           if i<j 
             out[i-minimum(I[1])+1,j-minimum(I[2])+1]= 0.0
           else
             out[i-minimum(I[1])+1,j-minimum(I[2])+1] = cor(vec(data[i,:]), vec(data[j,:]))
           end
         end
         out 
       end

In more detail, the DArray constructor takes a function which takes a tuple of index ranges and returns a chunk of the resulting matrix which corresponds to those index ranges. In the code above, I is the tuple of ranges with I[1] being the first range. You can see this more clearly with:

julia> DArray((10,10)) do I
       println(I)
       return zeros(length(I[1]),length(I[2]))
       end
        From worker 2:  (1:10,1:5)
        From worker 3:  (1:10,6:10)

where you can see it split the array into two chunks on the second axis.

The trickiest part of the example was converting from these 'global' index ranges to local index ranges by subtracting off the minimum element and then adding back 1 for the 1 based indexing of Julia. Hope that helps!

Upvotes: 3

Related Questions