Abhishek Govilkar

Reputation: 27

Difference between 2 columns

I need help computing the difference between two columns in my data set. A sample of the data is below for reference:

User    COL-A   COL-B   Difference
10050   1360    1330    30
10051   1160    1150    10
10052   1150    
10053   1175    1170    5
10054   1175        
10055   1175        
10056   1175    1170    5
10057   1175    1170    5
10058   1170    
10059   1040    1030    10
10060   1060    
10061   1080    1060    20
10062   1100    
10063   1130    1100    30
10064   1130    1100    30
10065   1100    
10066   1130    1100    30
10067   1130    1100    30
10068   1100    
10069   1130    1100    30
10070   1130    1100    30
10071   1130        
10072   1130    1100    30
10073   1130        
10074   1130    1100    30
10075   1130    1100    30
10076   1130    1100    30
10077   1130    1100    30
10078   1130    1100    30
10079   1130    

My data set has two main columns, COL-A and COL-B, and I want to store their difference in a third column. I tried the following code:

MOP_NEW$Difference <- MOP_NEW$`COL-A` - MOP_NEW$`COL-B`

However, this code also produces a result for rows where COL-A or COL-B is blank. I only want to subtract when both COL-A and COL-B have a value, and to return a blank (NULL) value when either of them is missing.

Hope I am able to explain my problem in simple terms.

Thanks in advance.

Abhishek

Upvotes: 0

Views: 44

Answers (2)

Rui Barradas

Reputation: 76615

Here is a base R way. For each row it checks whether at least one of the two values is not NA. If both are NA, the result is NA. If exactly one is NA, that value is replaced by zero. It then computes the difference COL_A - COL_B.

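MOP_NEW$Difference <- apply(MOP_NEW[2:3], 1, function(x){
  na <- is.na(x)          # which of COL_A, COL_B are missing
  if(all(na)){
    NA                    # both missing: nothing to compute
  }else{
    x[na] <- 0            # exactly one missing: treat it as zero
    x[1] - x[2]           # COL_A - COL_B
  }
})
MOP_NEW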

Data.

MOP_NEW <- read.table(text = "
User    COL_A   COL_B   Difference
10050   1360    1330    30
10051   1160    1150    10
10052   1150    
10053   1175    1170    5
10054   1175        
10055   1175        
10056   1175    1170    5
10057   1175    1170    5
10058   1170    
10059   1040    1030    10
10060   1060    
10061   1080    1060    20
10062   1100    
10063   1130    1100    30
10064   1130    1100    30
10065   1100    
10066   1130    1100    30
10067   1130    1100    30
10068   1100    
10069   1130    1100    30
10070   1130    1100    30
10071   1130        
10072   1130    1100    30
10073   1130        
10074   1130    1100    30
10075   1130    1100    30
10076   1130    1100    30
10077   1130    1100    30
10078   1130    1100    30
10079   1130    
", header = TRUE, fill = TRUE)
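
For larger data, the same rule can probably be written without a row-wise apply. The following is only a sketch under the assumptions of this answer (both missing gives NA, a single missing value counts as zero); the `demo` data frame and the helper names `a0`, `b0` and `both_missing` are made up for illustration.

```r
# Hypothetical three-row sample; column names follow the data above.
demo <- data.frame(
  User  = c(10050, 10052, 10054),
  COL_A = c(1360, 1150, NA),
  COL_B = c(1330, NA, NA)
)

a <- demo$COL_A
b <- demo$COL_B

# TRUE only for rows where both columns are missing.
both_missing <- is.na(a) & is.na(b)

# Replace a missing value by zero, as the apply() version does.
a0 <- ifelse(is.na(a), 0, a)
b0 <- ifelse(is.na(b), 0, b)

# NA when both are missing, otherwise COL_A - COL_B.
demo$Difference <- ifelse(both_missing, NA, a0 - b0)
demo
```

Since `ifelse` and `is.na` work on whole vectors, this avoids calling a function once per row.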

Upvotes: 1

Mike V

Reputation: 1364

Suppose you have a data frame like the one below. A plain vectorised subtraction gives the difference:

df <- data.frame(COL_A = c(10050,1360, 1330, 30, 10051, 1160, 1150, 10, 10052, 1150,
                           10053, 1175, 1170, 5, 10054, 1175),
                 COL_B = c(10052,1364, 1335, 10, 10021, 1130, 1110, 50, 10012, 1110,
                           10043, 1125, 1130, 2, 10034, 1145))

df$difference <- df$COL_A - df$COL_B

Output:

   COL_A COL_B difference
1  10050 10052         -2
2   1360  1364         -4
3   1330  1335         -5
4     30    10         20
5  10051 10021         30
6   1160  1130         30
7   1150  1110         40
8     10    50        -40
9  10052 10012         40
10  1150  1110         40
11 10053 10043         10
12  1175  1125         50
13  1170  1130         40
14     5     2          3
15 10054 10034         20
16  1175  1145         30
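
The example above has no missing values. As a hedged sketch of what happens when it does, assuming the blanks were read in as NA (the `df_na` data frame and the temporary file are hypothetical): arithmetic on NA returns NA, so the same subtraction leaves the difference missing whenever either column is missing.

```r
# Hypothetical sample with missing entries stored as NA.
df_na <- data.frame(
  COL_A = c(1360, 1150, NA, 1175),
  COL_B = c(1330, NA, 1100, NA)
)

# Any operand that is NA makes the result NA; no explicit check is needed.
df_na$difference <- df_na$COL_A - df_na$COL_B
df_na

# Optional: write NA as an empty field when exporting to CSV.
out_file <- tempfile(fileext = ".csv")
write.csv(df_na, out_file, na = "", row.names = FALSE)
```

Inside R the missing differences stay NA; the `na = ""` argument only changes how they are written to the file.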

Upvotes: 0
