I have a custom boolean function that checks a string. My actual function does more than the one below, which is only an illustrative example.
If I use the first version with dplyr::mutate(), it is applied only to the first value, and every row is then set to that answer.
I can wrap the function body in purrr::map_lgl() (second version), but this seems very slow on larger datasets. It also doesn't seem to be the way that mutate() normally works.
library(tidyverse)
valid_string <- function(string) {
  # Check the length
  if (stringr::str_length(string) != 10) {
    return(FALSE)
  }
  return(TRUE)
}
# Create a tibble to test on
test_tib <- tibble::tibble(string = c("1504915593", "1504915594", "9999999999", "123"),
                           known_valid = c(TRUE, TRUE, TRUE, FALSE))
# Apply the function
test_tib <- dplyr::mutate(test_tib, check_valid = valid_string(string))
test_tib
valid_string2 <- function(string) {
  purrr::map_lgl(string, function(string) {
    # Check the length
    if (stringr::str_length(string) != 10) {
      return(FALSE)
    }
    return(TRUE)
  })
}
# Apply the function
test_tib <- dplyr::mutate(test_tib, check_valid2 = valid_string2(string))
test_tib
Upvotes: 1
I would suggest rewriting your function as a vectorized function, like this:
valid_string <- function(string) {
  # Check the length
  ifelse(stringr::str_length(string) != 10, FALSE, TRUE)
}
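As a side note, the comparison inside ifelse() is already vectorized, so the wrapper is optional. A minimal sketch of the same check without ifelse(); the name valid_string_cmp is hypothetical and not part of the question:

```r
library(stringr)

# Hypothetical name; same length check as above, without ifelse()
valid_string_cmp <- function(string) {
  # str_length() returns one length per element and == compares element-wise,
  # so the result is a logical vector of the same length as the input
  stringr::str_length(string) == 10
}

strings <- c("1504915593", "1504915594", "9999999999", "123")
result <- valid_string_cmp(strings)
result
```

Because it returns one value per input element, it can be passed to dplyr::mutate() in the same way as the ifelse() version.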
Another option would be the Vectorize function from base R, which would work like this:
valid_string2 <- function(string) {
  # Check the length
  if (stringr::str_length(string) != 10) {
    return(FALSE)
  }
  return(TRUE)
}
valid_string2 <- Vectorize(valid_string2)
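To make it clearer what Vectorize does here, the sketch below compares it with a hand-written loop over the elements. The names check_one, via_vectorize and via_vapply are hypothetical, and base nchar() is swapped in for stringr::str_length() only so the snippet has no package dependencies:

```r
# Scalar check: only valid for a single string at a time
check_one <- function(string) {
  if (nchar(string) != 10) {
    return(FALSE)
  }
  TRUE
}

strings <- c("1504915593", "1504915594", "9999999999", "123")

# Vectorize() returns a wrapper that calls mapply() over the argument,
# so check_one() still runs once per element
via_vectorize <- Vectorize(check_one)(strings)

# Roughly the same loop written by hand with vapply()
via_vapply <- vapply(strings, check_one, logical(1))

# Both results use the input strings as names; drop them to compare values
unname(via_vectorize)
unname(via_vapply)
```

Both calls give the same logical values. Neither avoids the per-element function call, which is the cost the question noticed with purrr.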
Both work well, but I would suggest the ifelse solution: Vectorize is a wrapper around mapply, so it still calls the function once per element.
# Create a tibble to test on
test_tib <- tibble::tibble(string = c("1504915593", "1504915594", "9999999999", "123"),
                           known_valid = c(TRUE, TRUE, TRUE, FALSE))
# Apply the function
test_tib <- dplyr::mutate(test_tib, check_valid = valid_string(string))
test_tib <- dplyr::mutate(test_tib, check_valid2 = valid_string2(string))
test_tib
  string     known_valid check_valid check_valid2
  <chr>      <lgl>       <lgl>       <lgl>
1 1504915593 TRUE        TRUE        TRUE
2 1504915594 TRUE        TRUE        TRUE
3 9999999999 TRUE        TRUE        TRUE
4 123        FALSE       FALSE       FALSE
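Since the question was about speed on larger datasets, one way to compare the two approaches on your own machine is a rough timing sketch like the one below. The test data and the names big, vec_check and loop_check are made up for illustration, base nchar() stands in for stringr::str_length(), and the timings will vary by machine, so none are shown:

```r
set.seed(1)

# Hypothetical test data: 100,000 strings of length 9 or 10
big <- strrep("1", sample(9:10, 1e5, replace = TRUE))

# Vectorized check: one call handles the whole vector
vec_check <- function(string) nchar(string) == 10

# Element-wise check wrapped with Vectorize(): one call per string
loop_check <- Vectorize(function(string) {
  if (nchar(string) != 10) {
    return(FALSE)
  }
  TRUE
})

time_vec  <- system.time(res_vec  <- vec_check(big))
time_loop <- system.time(res_loop <- loop_check(big))

# Confirm both approaches agree before comparing timings
stopifnot(identical(res_vec, unname(res_loop)))

time_vec
time_loop
```

For more careful measurements, a dedicated benchmarking package would be a better fit than a single system.time() call.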
Upvotes: 1
Is this what you are looking for?
test_tib <- dplyr::mutate(test_tib, checkval = ifelse(nchar(string) != 10, FALSE, TRUE))
Upvotes: 0