ghs101

Reputation: 113

Error using the map function from the purrr package

I want to use the map function but I have trouble fixing an error.

> values <- T8_mut %>% select (start)
> values
    start
1  610661
2 1366584
3 1570287
4 1948432
5 2047458
> get_pos1 <- function(x) {
+     T8 %>% 
+     mutate(abs_diff = abs(start - x)) %>% arrange(abs_diff) %>% slice_head(n = 1) 
+ }
> nucdiff_genes <- as.data.frame(map(values, get_pos1 ))
Warning message:
Problem while computing `abs_diff = abs(start - x)`.
ℹ longer object length is not a multiple of shorter object length 



> T8 %>% mutate(abs_diff = abs(start - 610661)) %>% arrange(abs_diff) %>% slice_head(n = 1) 
  seqnames  start    end width strand   source type score phase         ID                           Name  locus_tag                        product       Dbxref
1 contig_1 610793 611278   486      + Prodigal  CDS    NA     0 C347_02750 IS200/IS605 family transposase C347_02750 IS200/IS605 family transposase COG:COG1....
             gene inference anti_codon amino_acid pseudo            func abs_diff
1 tnp-IS200,iS605      <NA>       <NA>       <NA>   <NA> tnp-IS200,iS605      132
> 

The code works when I fill in the value manually; however, when I use the map function, I get the warning shown above.

Upvotes: 0

Views: 125

Answers (1)

Andy Baxter

Reputation: 7646

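When you call map on values, you are most likely iterating over a data frame with one column rather than over the numbers in it. A data frame is a list of columns, so map applies the function once per column and passes the entire start column as x. Inside get_pos1, start - x then subtracts vectors of different lengths, which is what triggers the "longer object length is not a multiple of shorter object length" warning. To map over the values contained in the column itself, pull it from the data frame to create a vector: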

library(tidyverse)

T8 <- tibble(id = letters[1:10],
             start = sample(1000000:9999999, 10))


values <- T8 |> pull(start)

get_pos1 <- function(x) {
  T8 %>%
    mutate(abs_diff = abs(start - x)) %>% 
    arrange(abs_diff) %>% 
    slice_head(n = 1)
}

map_df(values, get_pos1)
#> # A tibble: 10 × 3
#>    id      start abs_diff
#>    <chr>   <int>    <int>
#>  1 a     8122151        0
#>  2 b     5460739        0
#>  3 c     2863191        0
#>  4 d     1997677        0
#>  5 e     2303171        0
#>  6 f     6267346        0
#>  7 g     1511788        0
#>  8 h     2189209        0
#>  9 i     6638907        0
#> 10 j     8081678        0
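
To see the difference between mapping over a one-column data frame and mapping over the vector inside it, here is a minimal sketch. The name values_df is a hypothetical stand-in for the asker's values object, filled with the five start positions printed in the question.

```r
library(purrr)

# Hypothetical stand-in for the asker's `values`: a one-column data frame
values_df <- data.frame(start = c(610661, 1366584, 1570287, 1948432, 2047458))

# Mapping over the data frame calls the function once, with the whole column
calls_df <- map(values_df, length)

# Mapping over the extracted vector calls the function once per value
calls_vec <- map(values_df$start, length)

length(calls_df)   # number of calls made over the data frame
length(calls_vec)  # number of calls made over the vector
```

The first call sees a single argument of length 5, while the second sees five arguments of length 1, which is what get_pos1 expects.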

That said, it's still unclear what you're trying to achieve, and we can't test this on your actual data without a data sample. Feel free to edit the question to add further information.
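
In the absence of a data sample, the following sketch uses made-up data in which the query positions do not coincide with any row, so abs_diff is non-zero. It assumes the goal is to find, for each value, the row of T8 whose start is closest; the contents of T8 and values here are invented for illustration and are not the asker's data.

```r
library(dplyr)
library(purrr)

# Made-up data, not the asker's: three features and two query positions
T8 <- tibble(id = c("a", "b", "c"), start = c(100, 500, 900))
values <- c(120, 800)

get_pos1 <- function(x) {
  T8 %>%
    mutate(abs_diff = abs(start - x)) %>%  # distance of every row from x
    arrange(abs_diff) %>%                  # closest row first
    slice_head(n = 1)                      # keep only the closest row
}

# One row per query value: the feature whose start is closest to it
nearest <- map(values, get_pos1) %>% bind_rows()
nearest
```

Here the first query (120) should match row a and the second (800) should match row c.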

Upvotes: 1
