Al3xEP
Al3xEP

Reputation: 328

Replace two for loops using tidyverse (dplyr and purrr)

I have a tibble

raw.tb
#> # A tibble: 10 x 4
#>       geno    ind     X     Y
#>  *  <fctr> <fctr> <int> <int>
#>  1 san1w16     A1   467   383
#>  2 san1w16     A1   465   378
#>  3 san1w16     B1   464   378
#>  4 san1w16     B1   464   377
#>  5 san2w16     A1   464   376
#>  6 san2w16     A1   464   375
#>  7 san2w16     B1   463   375
#>  8 san2w16     B1   463   374
#>  9 san3w16     A1   463   373
#> 10 san3w16     A1   463   372

I would like to replace the usage of two for loops using tidyverse. I am working with a function that requires a 2x2 matrix (It could be any function - in this specific case it is momocs::coo_rotate).

What i wish to do can be written like that in base R:

for(g in unique(raw.tb$geno)){
   for(i in unique(raw.tb[raw.tb$geno == g,]$ind){     
      raw.tb[raw.tb$geno == g & raw.tb$ind == i,c(3,4)] = some.function.for.a.matrix(raw.tb[raw.tb$geno == g & raw.tb$ind == i,c(3,4)])
   }
   }

I'm guessing this can be done using the tidyverse, however I have looked at using group_by() with do(), and nest with map and I cannot make it work.

Upvotes: 0

Views: 768

Answers (2)

Al3xEP
Al3xEP

Reputation: 328

I found it. Hope my answer will make things clearer. My apologies to @EricFail for not making this clearer

Basically I wrote a function which given a matrix of x,y coordinates rotates the coordinates using the first and last points as baseline. I am not detailing that function since it is long and is not the point here, but basically, the function is of the type:

rotate.coord <- function(mat){
  for(i in 1:dim(mat)[1]{
    x1=(dim(coord.rot)[1])
    x2=1
    .
    .
    (theta is computed based on x1 and x2)
    .
    .
    xn=mat[z,1]*cos(theta)+mat[z,2]*sin(theta)
    yn=-mat[z,1]*sin(theta)+mat[z,2]*cos(theta)
    mat[z,1]=xn
    mat[z,2]=yn
    }
    mat = as_tibble(mat)
    return(mat)
}

Having:

raw.tb
#> # A tibble: 10 x 4
#>       geno    ind     X     Y
#>  *  <fctr> <fctr> <int> <int>
#>  1 san1w16     A1   467   383
#>  2 san1w16     A1   465   378
#>  3 san1w16     B1   464   378
#>  4 san1w16     B1   464   377
#>  5 san2w16     A1   464   376
#>  6 san2w16     A1   464   375
#>  7 san2w16     B1   463   375
#>  8 san2w16     B1   463   374
#>  9 san3w16     A1   463   373
#> 10 san3w16     A1   463   372

I wanted to do

raw.nt <- raw.tb %>% 
 group_by(geno,ind) %>% 
 nest()

raw.nt2 <- raw.nt %>% 
  mutate(rot = map(data,rotate.coo))

Which creates a new nested tibble where the raw.nt2$rot for each group is the rotated matrix from each group of raw.nt$data

Upvotes: 0

Eric Fail
Eric Fail

Reputation: 7928

I am somewhat guessing as it's not clear to me exactly what you are looking to do.

raw.tb <- structure(list(geno = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L), .Label = "san1w16", class = "factor"), ind = structure(c(1L, 
1L, 2L, 2L, 3L, 3L, 4L, 4L, 5L, 1L), .Label = c("A1", "B1", "C1", 
"D1", "E1"), class = "factor"), X = c(467L, 465L, 464L, 464L, 
464L, 464L, 463L, 463L, 463L, 463L), Y = c(383L, 378L, 378L, 
377L, 376L, 375L, 375L, 374L, 373L, 372L)), .Names = c("geno", 
"ind", "X", "Y"), row.names = c("1", "2", "3", "4", "5", "6", 
"7", "8", "9", "10"), class = c("tbl_df", "tbl", "data.frame"
)) %>% as_tibble(); raw.tb 
#> # A tibble: 10 x 4
#>       geno    ind     X     Y
#>  *  <fctr> <fctr> <int> <int>
#>  1 san1w16     A1   467   383
#>  2 san1w16     A1   465   378
#>  3 san1w16     B1   464   378
#>  4 san1w16     B1   464   377
#>  5 san1w16     C1   464   376
#>  6 san1w16     C1   464   375
#>  7 san1w16     D1   463   375
#>  8 san1w16     D1   463   374
#>  9 san1w16     E1   463   373
#> 10 san1w16     A1   463   372

something like this,

raw.tb %>% group_by(geno) %>% gather(XY, Value, -geno, -ind) %>% arrange(geno, ind)
#> # A tibble: 20 x 4
#> # Groups:   geno [1]
#>       geno    ind    XY Value
#>     <fctr> <fctr> <chr> <int>
#>  1 san1w16     A1     X   467
#>  2 san1w16     A1     X   465
#>  3 san1w16     A1     X   463
#>  4 san1w16     A1     Y   383
#>  5 san1w16     A1     Y   378
#>  6 san1w16     A1     Y   372
#>  7 san1w16     B1     X   464
#>  8 san1w16     B1     X   464
#>  9 san1w16     B1     Y   378
#> 10 san1w16     B1     Y   377
#> 11 san1w16     C1     X   464
#> 12 san1w16     C1     X   464
#> 13 san1w16     C1     Y   376
#> 14 san1w16     C1     Y   375
#> 15 san1w16     D1     X   463
#> 16 san1w16     D1     X   463
#> 17 san1w16     D1     Y   375
#> 18 san1w16     D1     Y   374
#> 19 san1w16     E1     X   463
#> 20 san1w16     E1     Y   373

from there you can apply alommost any function. Here some summarise as suggeted by akrun

raw.tb %>% group_by(geno) %>% gather(XY, Value, -geno, -ind) %>%
           arrange(geno, ind) %>% group_by(ind, geno, XY)  %>%
           summarise(Value = mean(Value))
#> # A tibble: 10 x 4
#> # Groups:   ind, geno [?]
#>       ind    geno    XY    Value
#>    <fctr>  <fctr> <chr>    <dbl>
#>  1     A1 san1w16     X 465.0000
#>  2     A1 san1w16     Y 377.6667
#>  3     B1 san1w16     X 464.0000
#>  4     B1 san1w16     Y 377.5000
#>  5     C1 san1w16     X 464.0000
#>  6     C1 san1w16     Y 375.5000
#>  7     D1 san1w16     X 463.0000
#>  8     D1 san1w16     Y 374.5000
#>  9     E1 san1w16     X 463.0000
#> 10     E1 san1w16     Y 373.0000

or maybe

raw.tb %>% group_by(geno) %>% gather(XY, Value, -geno, -ind) %>%
           arrange(geno, ind) %>% group_by(ind, geno)  %>%
           summarise(Value = mean(Value))
#> # A tibble: 5 x 3
#> # Groups:   ind [?]
#>      ind    geno    Value
#>   <fctr>  <fctr>    <dbl>
#> 1     A1 san1w16 421.3333
#> 2     B1 san1w16 420.7500
#> 3     C1 san1w16 419.7500
#> 4     D1 san1w16 418.7500
#> 5     E1 san1w16 418.0000

Upvotes: 1

Related Questions