Reputation: 326
I need to create groups of rows from my dataframe using a custom function as the grouping criterion. That function compares a pair of rows and returns TRUE or FALSE depending on whether those rows should be grouped together.
In an example dataset like:
id field code1 code2
1 textField1 055 066
2 textField2 100 120
3 textField3 300 350
4 textField4 800 450
5 textField5 460 900
6 textField6 490 700
...
The function checks certain rules between the fields of a pair of rows (function(row1, row2)) and returns TRUE if those rows should be together and FALSE otherwise.
I need to apply that function to all possible pairs in the dataframe and generate a list (or other structure) with all the IDs that matched.
One way to apply the function to each pair is shown in this answer:
lapply(seq_len(nrow(df) - 1),
       function(i) {
         customFunction(df[i, ], df[i + 1, ])
       })
But I cannot think of a way to group the rows that returned TRUE.
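Note that the lapply() call above only compares consecutive rows (i and i + 1). A minimal sketch of visiting every pair with combn() could look like the following. The toy data and the rule inside customFunction are placeholders assumed for illustration, not the real criteria, and matched_ids is a hypothetical name.

```r
# Toy data with only the columns needed for the illustration
df <- data.frame(id    = 1:4,
                 code1 = c(55, 100, 300, 110),
                 code2 = c(66, 120, 350, 115))

# Placeholder rule (an assumption): two rows match when their
# code1 values differ by less than 50
customFunction <- function(row1, row2) {
  abs(row1$code1 - row2$code1) < 50
}

# Every unordered pair of row indices, one pair per column
pairs <- combn(nrow(df), 2)

# Apply the rule to each pair; the result is one TRUE/FALSE per pair
matched <- apply(pairs, 2, function(p) {
  customFunction(df[p[1], ], df[p[2], ])
})

# IDs of the pairs that should end up in the same group
matched_ids <- data.frame(id1 = df$id[pairs[1, matched]],
                          id2 = df$id[pairs[2, matched]])
```

This yields the matching pairs only; merging pairs that share an ID into larger groups is still a separate step.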
EDIT: Re-reading my question, it seems to need an example:
If we created a matrix with all the possible combinations, the result would be:
[,1] [,2] [,3] [,4] [,5] [,6]
[1,] TRUE FALSE FALSE FALSE FALSE FALSE
[2,] FALSE TRUE TRUE TRUE FALSE FALSE
[3,] FALSE TRUE TRUE FALSE FALSE FALSE
[4,] FALSE TRUE FALSE TRUE FALSE FALSE
[5,] FALSE FALSE FALSE FALSE TRUE TRUE
[6,] FALSE FALSE FALSE FALSE TRUE TRUE
The resulting groups would then be:
1
2,3,4
5,6
Upvotes: 2
Views: 1321
Reputation: 10167
Here's a function that does what you've specified:
mx <- matrix(c(TRUE,  FALSE, FALSE, FALSE, FALSE, FALSE,
               FALSE, TRUE,  TRUE,  TRUE,  FALSE, FALSE,
               FALSE, TRUE,  TRUE,  FALSE, FALSE, FALSE,
               FALSE, TRUE,  FALSE, TRUE,  FALSE, FALSE,
               FALSE, FALSE, FALSE, FALSE, TRUE,  TRUE,
               FALSE, FALSE, FALSE, FALSE, TRUE,  TRUE), 6)
groupings <- function(mx){
  # assumes mx is a symmetric logical matrix with TRUE on the diagonal
  out <- list()
  # original row numbers of the rows not yet assigned to a group
  ids <- seq_len(nrow(mx))
  while(nrow(mx) > 0){
    # get the rows that match the first remaining row
    g <- which(mx[, 1])
    # expand the selection to any row that matches
    # a member of the current group
    expansion <- which(apply(mx[, g, drop = FALSE], 1, any))
    while(length(expansion) > length(g)){
      g <- expansion
      expansion <- which(apply(mx[, g, drop = FALSE], 1, any))
    }
    # store the group using the original row numbers
    out <- c(out, list(ids[g]))
    # remove the grouped rows; drop = FALSE keeps mx a matrix
    # even when a single row is left
    ids <- ids[-g]
    mx <- mx[-g, -g, drop = FALSE]
  }
  return(out)
}
groupings(mx)
#> [[1]]
#> [1] 1
#>
#> [[2]]
#> [1] 2 3 4
#>
#> [[3]]
#> [1] 5 6
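The expansion loop in groupings() amounts to finding the connected components of the graph whose adjacency matrix is mx. As an alternative sketch in base R, a different technique (label propagation) gives every row the smallest row number it can reach through matches. The name component_groups is hypothetical, and the sketch assumes mx is symmetric.

```r
# Same logical matrix as in the question (symmetric, TRUE on the diagonal)
mx <- matrix(c(TRUE,  FALSE, FALSE, FALSE, FALSE, FALSE,
               FALSE, TRUE,  TRUE,  TRUE,  FALSE, FALSE,
               FALSE, TRUE,  TRUE,  FALSE, FALSE, FALSE,
               FALSE, TRUE,  FALSE, TRUE,  FALSE, FALSE,
               FALSE, FALSE, FALSE, FALSE, TRUE,  TRUE,
               FALSE, FALSE, FALSE, FALSE, TRUE,  TRUE), 6)

component_groups <- function(mx) {
  n <- nrow(mx)
  # start with every row in its own group, labelled by its row number
  label <- seq_len(n)
  repeat {
    # each row takes the smallest label among itself and the rows it matches
    new_label <- vapply(seq_len(n), function(i) {
      min(label[mx[i, ] | seq_len(n) == i])
    }, integer(1))
    # stop once no label changes any more
    if (identical(new_label, label)) break
    label <- new_label
  }
  # rows sharing a label form one group
  unname(split(seq_len(n), label))
}

res <- component_groups(mx)
# res holds three groups: 1 | 2 3 4 | 5 6
```

Because the labels keep the original row numbers, no bookkeeping is needed for rows that have already been removed from the matrix.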
Upvotes: 1