Reputation: 3643
I'm sure there's a great reason for this that I am not finding at the moment, but ... why does dplyr coerce characters to factors, even when you explicitly coerce to character?
> letters
[1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
> typeof(letters)
[1] "character"
> data.frame(
+ colA = as.character(letters),
+ colB = as.character(LETTERS)
+ ) %>%
+ glimpse
Observations: 26
Variables: 2
$ colA <fct> a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z
$ colB <fct> A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z
Upvotes: 2
Views: 209
Reputation: 887571
It is not dplyr that coerces the columns to factor, but data.frame (the base R constructor), whose default was stringsAsFactors = TRUE in R versions before 4.0.0 (from 4.0.0 onward the default is FALSE). Specifying stringsAsFactors = FALSE rectifies the issue on any version:
data.frame(
colA = letters,
colB = LETTERS, stringsAsFactors = FALSE
)
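To see the effect directly, here is a minimal base R sketch that builds the same data frame with the argument set both ways and checks the column classes. The names df_factor, df_char, classes_factor, and classes_char are just illustrative; stringsAsFactors is set explicitly so the result should not depend on the installed R version's default.

```r
# Build the same data frame twice, setting stringsAsFactors explicitly
# so the outcome does not rely on the R version's default.
df_factor <- data.frame(colA = letters, colB = LETTERS, stringsAsFactors = TRUE)
df_char   <- data.frame(colA = letters, colB = LETTERS, stringsAsFactors = FALSE)

# Collect the class of every column in each data frame.
classes_factor <- sapply(df_factor, class)
classes_char   <- sapply(df_char, class)

print(classes_factor)
print(classes_char)
```

The first call should report factor for both columns and the second character, which matches the <fct> columns in the question's glimpse output.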
NOTE: There is no need to wrap the columns in as.character, since letters and LETTERS are already character vectors.
As we are using tidyverse, another option is tibble, which never converts character vectors to factors (it has no stringsAsFactors argument):
tibble(colA = letters, colB = LETTERS)
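If a data frame has already been created with factor columns, the columns can also be converted back afterwards. This is a hedged base R sketch rather than part of the original answer; df and is_fct are illustrative names.

```r
# Start from a data frame whose string columns were turned into factors.
df <- data.frame(colA = letters, colB = LETTERS, stringsAsFactors = TRUE)

# Find the factor columns and convert only those back to character.
is_fct <- sapply(df, is.factor)
df[is_fct] <- lapply(df[is_fct], as.character)

str(df)
```

Converting only the columns flagged by is.factor leaves numeric or logical columns untouched in data frames that mix types.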
Upvotes: 5