J. Doe
J. Doe

Reputation: 619

How to get the absolute difference between values in two columns in a matrix

I'm having a matrix like the following

      i j value
 [1,] 3 6 0.194201129
 [2,] 3 5 0.164547043
 [3,] 3 4 0.107149279
 [4,] 4 3 0.004927017
 [5,] 3 1 0.080454448
 [6,] 1 2 0.003220612
 [7,] 2 6 0.162313646
 [8,] 3 3 0.114992628
 [9,] 4 1 0.015337253
[10,] 1 6 0.026550051
[11,] 3 2 0.057004116
[12,] 4 2 0.006441224
[13,] 4 5 0.025641026
[14,] 2 4 0.004885993
[15,] 1 1 0.036552785
[16,] 1 5 0.048249186
[17,] 1 4 0.006053565
[18,] 1 3 0.004970296

As you can see for some i, j pairs there is an inverse pair. For example for i = 3, j = 1 , there is a pair with i = 1, j = 3.

Here is what I want to achieve.

For every i, j pair to subtract its inverse value and get the absolute value of the subtraction. For those pairs that have no inverse pair, 0 is subtracted from them.

Here are a couple of examples:

For i = 3, j = 5 there is no inverse pair (i = 5, j = 3) and thus the calculation becomes:

abs(0.164547043 - 0)

For i = 3, j = 1 there is an inverse pair on the matrix with i = 1, j = 3 and thus the calculation is going to be :

abs(0.004970296 - 0.080454448)

I approached this, by writing a bunch of code (65 lines) full of for loops and it's hard to read and be edited.

So I was wondering if there is another more efficient way to do something like that, by using more compact functions.

Motivated by a previous post where its answer was pretty simple (by using the aggregate() function) and by searching online for those functions, I'm trying to use here the mapply(), but the truth is that I cannot handle the inverse pairs.

EDIT:

dput()
    memMatrix <- structure(c(3, 3, 3, 4, 3, 1, 2, 3, 4, 1, 3, 4, 4, 2, 1, 1, 1, 
        1, 6, 5, 4, 3, 1, 2, 6, 3, 1, 6, 2, 2, 5, 4, 1, 5, 4, 3, 0.194201128983738, 
        0.164547043451226, 0.107149278958536, 0.00492701677834917, 0.0804544476798398, 
        0.00322061191626409, 0.162313646044361, 0.114992627755601, 0.0153372534398016, 
        0.0265500506171091, 0.0570041160347523, 0.00644122383252818, 
        0.0256410256410256, 0.00488599348534202, 0.0365527853282693, 
        0.0482491856677524, 0.0060535654765406, 0.00497029586494912), .Dim = c(18L, 
        3L), .Dimnames = list(NULL, c("i", "j", "value")))

Also here is the code that so far works but it is a lot more complicated

Where memMatrix is the matrix given on top of the post. And here you cans see a little difference that I'm multiplying the absolut value with a variable called probability_distribution, but that's doesn't really matter. I through it away (the multiplcation) from the initial post to make it more simple.

subFunc <- function( memMatrix , probability_distribution )
{

  # Node specific edge relevance matrix
  node_edgeRelm <- matrix(ncol = 3)
  colnames(node_edgeRelm) <- c("i","j","rel")
  node_edgeRelm <- na.omit(node_edgeRelm)

  for ( row in 1:nrow( memMatrix ) )
  {
    pair_i <- memMatrix[row,"i"]
    pair_j <- memMatrix[row,"j"]

    # If already this pair of i and j has been calculated continue with the next pair
    # At the end of a new calculation, we store the i,j (verse) values in order from lower to higher
    # and then we check here for the inverse j,i values (if exists).
    if( pair_i < pair_j )
      if( any(node_edgeRelm[,"i"] == pair_i & node_edgeRelm[,"j"] == pair_j) ) next
    if( pair_j < pair_i )
      if( any(node_edgeRelm[,"i"] == pair_j & node_edgeRelm[,"j"] == pair_i) ) next

    # Verse i,j
    mepm_ij <- as.numeric( memMatrix[which( memMatrix[,"i"] == pair_i & memMatrix[,"j"] == pair_j ), "mep"] )
    if( length(mepm_ij) == 0 )
      mepm_ij <- 0
    # Inverse j,i
    mepm_ji <- as.numeric( memMatrix[which( memMatrix[,"i"] == pair_j & memMatrix[,"j"] == pair_i ), "mep"] )
    if( length(mepm_ji) == 0 )
      mepm_ji <- 0

    # Calculate the edge relevance for that specific initial node x and pair i,j
    edge_relevance <- probability_distribution * abs( mepm_ij - mepm_ji )

    # Store that specific edge relevance with an order from lower to higher node
    if ( pair_i < pair_j)
      node_edgeRelm <- rbind( node_edgeRelm, c( as.numeric(pair_i), as.numeric(pair_j), as.numeric(edge_relevance) ) )
    else
      node_edgeRelm <- rbind( node_edgeRelm, c( as.numeric(pair_j), as.numeric(pair_i), as.numeric(edge_relevance) ) )
  }

  na.omit(node_edgeRelm)
}

you can run it as subFunc(memMatrix, 1/3)

Upvotes: 4

Views: 4187

Answers (4)

Roman
Roman

Reputation: 17648

You can try a tidyverse solution:

library(tidyverse)
df %>% as.tibble() %>% 
  rowwise() %>% 
  mutate(id=paste(sort(c(i,j)), collapse = "_"))  %>% 
  group_by(id) %>% 
  mutate(n=paste0("n", 1:n())) %>% 
  select(-1,-2) %>% 
  spread(n, value, fill = 0) %>% 
  mutate(result=abs(n1-n2))
# A tibble: 14 x 4
# Groups:   id [14]
      id          n1          n2      result
   <chr>       <dbl>       <dbl>       <dbl>
 1   1_1 0.036552785 0.000000000 0.036552785
 2   1_2 0.003220612 0.000000000 0.003220612
 3   1_3 0.080454448 0.004970296 0.075484152
 4   1_4 0.015337253 0.006053565 0.009283688
 5   1_5 0.048249186 0.000000000 0.048249186
 6   1_6 0.026550051 0.000000000 0.026550051
 7   2_3 0.057004116 0.000000000 0.057004116
 8   2_4 0.006441224 0.004885993 0.001555230
 9   2_6 0.162313646 0.000000000 0.162313646
10   3_3 0.114992628 0.000000000 0.114992628
11   3_4 0.107149279 0.004927017 0.102222262
12   3_5 0.164547043 0.000000000 0.164547043
13   3_6 0.194201129 0.000000000 0.194201129
14   4_5 0.025641026 0.000000000 0.025641026

The idea is:

  1. Sort rowwise i and j and paste together in a new column id.
  2. Group by id and add number of occurences n
  3. Spread by n
  4. calculate the absolute difference.

Upvotes: 2

G. Grothendieck
G. Grothendieck

Reputation: 269664

Assuming that the input is matrix m group the value elements by those that have the same i, j or j, i. There will either be 1 or 2 value elements in each such group so for any specific group append a zero to that 1 or 2 length vector and take the first 2 elements, difference the elements of the resulting 2 element vector and take the absolute value. This procedure does not change the row order. It gives a data frame but it could be converted back to a matrix if need be using as.matrix. No packages are used.

absdiff <- function(x) abs(diff(c(x, 0)[1:2]))
transform(m, value = ave(value, pmin(i, j), pmax(i, j), FUN = absdiff))

giving:

   i j       value
1  3 6 0.194201129
2  3 5 0.164547043
3  3 4 0.102222262
4  4 3 0.102222262
5  3 1 0.075484152
6  1 2 0.003220612
7  2 6 0.162313646
8  3 3 0.114992628
9  4 1 0.009283688
10 1 6 0.026550051
11 3 2 0.057004116
12 4 2 0.001555230
13 4 5 0.025641026
14 2 4 0.001555230
15 1 1 0.036552785
16 1 5 0.048249186
17 1 4 0.009283688
18 1 3 0.075484152

Upvotes: 4

Vlo
Vlo

Reputation: 3188

Here is a solution with library(purr) to make match() work on lists

library(purrr)

Create a match that operates on lists

match2 = as_mapper(match)

Create a list containing vectors with length 2 containing the two values, then second list with the values reversed, then match the two lists

i = match2(L <- map2(df[,1], df[,2], c),
                map(L, rev))

Extract third column of the matched indices

 v = df[i,3]

Replace the NA/unmatched with 0, do the subtraction then abs()

cbind(df, abs(df[,3]-replace(v, is.na(v), 0)))

Upvotes: 3

Onyambu
Onyambu

Reputation: 79238

Base R: Lte say the name of your matrix is mat

> B=matrix(0,max(mat[,1:2]),max(mat[,1:2]))
> B[mat[,1:2]]=mat[,3]
> A=cbind(which(upper.tri(B,T),T),abs(`diag<-`(B,0)[upper.tri(B,T)]-t(B)[upper.tri(B,T)]))
> A[A[,3]>0,]
      row col            
 [1,]   1   1 0.036552785
 [2,]   1   2 0.003220612
 [3,]   1   3 0.075484152
 [4,]   2   3 0.057004116
 [5,]   3   3 0.114992628
 [6,]   1   4 0.009283688
 [7,]   2   4 0.001555230
 [8,]   3   4 0.102222262
 [9,]   1   5 0.048249186
[10,]   3   5 0.164547043
[11,]   4   5 0.025641026
[12,]   1   6 0.026550051
[13,]   2   6 0.162313646
[14,]   3   6 0.194201129

Upvotes: 1

Related Questions