Reputation: 13
I am trying to create a new column that converts FIPS codes to state abbreviations using library(usmap). The problem is that the new column produced inside mutate does not match the size of the data: fips_info returns only 51 rows, not the 23570 rows of the original tibble.
Appreciate any help, thanks!
# define a function that returns the state abbreviation for a FIPS code
fips_function <- function(fips_code){
return (fips_info(fips_code)$abbr)
}
atus_19_selected <- act_19 %>%
mutate(state_abb = fips_function(GESTFIPS))
Error: Problem with `mutate()` input `state_abb`.
x Input `state_abb` can't be recycled to size 23570.
ℹ Input `state_abb` is `fips_function(GESTFIPS)`.
ℹ Input `state_abb` must be size 23570 or 1, not 51.
atus_19_selected
# A tibble: 23,570 x 8
GESTFIPS GTCO TUCASEID t150701 t150799 t150801 t150899 t159999
<dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 40 000 2.02e13 60 0 0 0 0
2 51 153 2.02e13 40 0 0 0 260
Upvotes: 1
Views: 127
Reputation: 887183
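The problem is that fips_info() returns one row per distinct FIPS code, so when GESTFIPS contains duplicate values the result is shorter than the input and mutate() cannot recycle it. One option is rowwise(), which calls the function once per row: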
library(usmap)
library(dplyr)
act_19 %>%
  rowwise() %>%
  mutate(state_abb = fips_function(GESTFIPS)) %>%
  ungroup()
-output
# A tibble: 3 x 2
# GESTFIPS state_abb
# <dbl> <chr>
#1 40 OK
#2 51 VA
#3 40 OK
Another option is to run this on the distinct values of 'GESTFIPS' and then do a join:
act_19 %>%
distinct(GESTFIPS) %>%
mutate(state_abb = fips_function(GESTFIPS)) %>%
right_join(act_19, by = "GESTFIPS")
-output
# A tibble: 3 x 2
# GESTFIPS state_abb
# <dbl> <chr>
#1 40 OK
#2 40 OK
#3 51 VA
The error can be reproduced with the duplicate values
act_19 %>%
mutate(state_abb = fips_function(GESTFIPS))
Error: Problem with `mutate()` input `state_abb`.
✖ Input `state_abb` can't be recycled to size 3.
ℹ Input `state_abb` is `fips_function(GESTFIPS)`.
ℹ Input `state_abb` must be size 3 or 1, not 2.
Run `rlang::last_error()` to see where the error occurred.
The issue arises directly from the %in% subsetting inside usmap:::get_fips_info, which keeps each matching row of the lookup table only once:
usmap:::get_fips_info
function (fips)
{
if (all(nchar(fips) == 2)) {
df <- utils::read.csv(system.file("extdata", "state_fips.csv",
package = "usmap"), colClasses = rep("character",
3), stringsAsFactors = FALSE)
result <- df[df$fips %in% fips, ] # -> would subset only for unique fips
...
act_19 <- tibble(GESTFIPS = c(40, 51, 40))
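The length mismatch can be seen in isolation with base R. This is a minimal sketch, not usmap's actual code: the three-row lookup table and the names by_in and by_match are hypothetical stand-ins for the package's state_fips.csv data.

```r
# Hypothetical three-row stand-in for the package's state lookup table.
lookup <- data.frame(
  abbr = c("OK", "TX", "VA"),
  fips = c("40", "48", "51"),
  stringsAsFactors = FALSE
)

# Input with a duplicated code, as in the question's GESTFIPS column.
codes <- c("51", "40", "51")

# %in% filters the lookup table: one row per matching code, in table order,
# so duplicates in the input collapse and the input order is lost.
by_in <- lookup[lookup$fips %in% codes, "abbr"]

# match() returns one table position per input element,
# so the result keeps the input's length and order.
by_match <- lookup$abbr[match(codes, lookup$fips)]

length(by_in)    # shorter than the input
length(by_match) # same length as the input
```

A result like by_match has the same length as the input, which is what mutate() needs; a result like by_in triggers the recycling error.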
Upvotes: 1
Reputation: 160447
When fips_info cannot match a FIPS code for some reason, it does not return anything for that entry, so you cannot guarantee a 1-to-1 input/output relationship.
Sample data with a deliberately invalid code (99) highlights this:
act_19 <- structure(list(GESTFIPS = c(40L, 99L), GTCO = c(0L, 153L), TUCASEID = c(2.02e+13, 2.02e+13), t150701 = c(60L, 40L), t150799 = c(0L, 0L), t150801 = c(0L, 0L), t150899 = c(0L, 0L), t159999 = c(0L, 260L)), class = "data.frame", row.names = c("1", "2"))
usmap::fips_info(act_19$GESTFIPS)
# Error in fips_info.numeric(act_19$GESTFIPS) :
# Invalid FIPS code(s), must be either 2 digit (states) or 5 digit (counties), but not both.
usmap::fips_info(as.character(act_19$GESTFIPS))
# Warning in get_fips_info(fips_) :
# FIPS code(s) 99 not found, excluded from result.
# abbr fips full
# 1 OK 40 Oklahoma
I suggest an alternative method:
abbrs <- usmap::fips_info(as.character(unique(act_19$GESTFIPS)))
# Warning in get_fips_info(fips_) :
# FIPS code(s) 99 not found, excluded from result.
abbrs
# abbr fips full
# 1 OK 40 Oklahoma
abbrs %>%
transmute(state_abb = abbr, GESTFIPS = as.integer(fips)) %>%
right_join(act_19, by = "GESTFIPS")
state_abb GESTFIPS GTCO TUCASEID t150701 t150799 t150801 t150899 t159999
1 OK 40 0 2.02e+13 60 0 0 0 0
2 <NA> 99 153 2.02e+13 40 0 0 0 260
You may still have entries with a missing (NA) or incorrect state_abb, but at least you'll retain all of your original rows and won't get that error.
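If a join feels heavy, the same 1-to-1 guarantee can be sketched as a vectorised lookup. This is only an illustration under stated assumptions: fips_to_abbr and state_lookup are hypothetical names, and the three-row table stands in for a full state table that you would build once from whichever FIPS source you trust.

```r
# Hypothetical lookup table (only three states, for illustration).
state_lookup <- data.frame(
  fips = c("01", "40", "51"),
  abbr = c("AL", "OK", "VA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: returns exactly one abbreviation per input code.
# Numeric codes are zero-padded to two digits before matching;
# codes that are not in the table come back as NA instead of being dropped.
fips_to_abbr <- function(codes, lookup) {
  padded <- sprintf("%02d", as.integer(codes))
  lookup$abbr[match(padded, lookup$fips)]
}

result <- fips_to_abbr(c(40, 51, 40, 99, 1), state_lookup)
result
```

Because the output always has the same length as the input, a helper like this can be used directly inside mutate(), for example mutate(state_abb = fips_to_abbr(GESTFIPS, state_lookup)).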
Upvotes: 1