Andre_k

Reputation: 1730

Get the subset of values that forms a sum in R

Suppose this is my dataframe df

ID  COL1
a1  12
a2  12
a3  1
a4  5
a5  10
a6  5
a7  5

I want to find the subsets of values in COL1 that add up to a given number. For example, for the target 25, I want to know which values from COL1 of df should be taken so that they sum to 25: (a1+a2+a3) = 25, (a4+a5+a6+a7) = 25, and every other such combination. The one condition is that an ID must not be added to itself to produce the result, as in (a1+a1+a3) or (a5+a5+a6).

This is what I tried

df$ID[seq(which(cumsum(df$COL1) == 25))]

But this gives me only a1, a2, a3.

Upvotes: 0

Views: 370

Answers (2)

jsv

Reputation: 740

library(gtools)

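a <- c(12, 12, 1, 5, 10, 5, 5)  # df$COL1
n <- length(a)                  # number of rows in the column

# One row per 0/1 selection vector: all 2^n candidate subsets
binary <- permutations(n = 2, r = n, v = c(0, 1), repeats.allowed = TRUE)

# Sum of the selected values for every row; keep the rows that sum to 25
mult <- binary %*% a
indices <- which(mult == 25)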

Hope this solves your problem.
Edit:

colnames(binary) <- df$ID
# drop = FALSE keeps a matrix even when only one row matches
as.matrix(apply(binary[indices, , drop = FALSE] == 1, 1, function(sel) paste0(colnames(binary)[sel], collapse = "")))

Does this work?
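
If you would rather not depend on gtools, the same 0/1-mask idea can be sketched in base R with expand.grid. This is only a sketch under the question's data: target, masks, hits and subsets are names I made up for illustration, and like the version above it enumerates all 2^n masks, so it only suits short columns.

```r
# Sketch: brute-force subset search with 0/1 masks, base R only.
# Data and target are taken from the question; variable names are illustrative.
df <- data.frame(ID = paste0("a", 1:7),
                 COL1 = c(12, 12, 1, 5, 10, 5, 5),
                 stringsAsFactors = FALSE)
target <- 25

# One row per 0/1 mask of length nrow(df); 2^7 = 128 rows here
masks <- as.matrix(expand.grid(rep(list(0:1), nrow(df))))

# Keep the masks whose selected values sum to the target
hits <- masks[as.vector(masks %*% df$COL1) == target, , drop = FALSE]

# Translate each mask back into the IDs it selects
subsets <- apply(hits, 1, function(sel) paste(df$ID[sel == 1], collapse = "+"))
print(unname(subsets))
# [1] "a1+a2+a3"    "a4+a5+a6+a7"
```

Because each mask position is either 0 or 1, an ID can never be counted twice, which covers the "no adding to itself" condition.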

Upvotes: 1

abichat

Reputation: 2416

You can try this:

ID <- paste0(rep("a", 7), 1:7)
COL1 <- c(12, 12, 1, 5, 10, 5, 5)
df <- data.frame(ID, COL1)


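n <- nrow(df)
for (i in seq_len(n)) {
  # Each column of comb is one set of i distinct row indices
  comb <- combn(n, i)
  for (j in seq_len(ncol(comb))) {
    subvec <- comb[, j]
    # Print the IDs whenever the chosen values add up to 25
    if (sum(df$COL1[subvec]) == 25) {
      print(df$ID[subvec])
    }
  }
}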

It gives this output:

[1] a1 a2 a3
Levels: a1 a2 a3 a4 a5 a6 a7
[1] a4 a5 a6 a7
Levels: a1 a2 a3 a4 a5 a6 a7
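
The loop above checks every combination, which grows as 2^n. For longer columns, one possible variation is a recursive search that abandons a branch as soon as the running sum passes the target. The sketch below assumes all values are non-negative (otherwise the pruning is not valid), and find_subsets is a hypothetical helper name, not a function from any package. It returns the matches as a list instead of printing them.

```r
# Sketch: recursive subset-sum search with pruning.
# Assumption: all values are non-negative, so a running sum above the
# target can never come back down and the branch can be dropped.
find_subsets <- function(ids, values, target) {  # hypothetical helper
  results <- list()

  recurse <- function(start, chosen, total) {
    # Record the current selection when it hits the target exactly
    if (total == target && length(chosen) > 0) {
      results[[length(results) + 1]] <<- ids[chosen]
    }
    if (start > length(values)) return(invisible(NULL))
    # Only indices after the last chosen one are tried,
    # so no row is ever used twice in the same subset
    for (k in start:length(values)) {
      if (total + values[k] <= target) {
        recurse(k + 1, c(chosen, k), total + values[k])
      }
    }
  }

  recurse(1, integer(0), 0)
  results
}

# Usage with the data from the question
res <- find_subsets(paste0("a", 1:7), c(12, 12, 1, 5, 10, 5, 5), 25)
print(res)
```

On the question's data this should return the same two subsets as the loop, (a1, a2, a3) and (a4, a5, a6, a7).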

Upvotes: 1
