Reputation: 619
I am looking to create a new variable that indicates whether an individual is on the left or right side: if their y value is greater than a they are on the left, and if their y value is less than a they are on the right. I have tried this code:
df <- df %>%
mutate(Side = case_when(y > a ~ "Left",
y < a ~ "Right",
y = a ~ "C"))
However, when I try it I get this error:
Error: Problem with `mutate()` input `Side`.
x object 'y' not found
ℹ Input `Side` is `case_when(y > a ~ "Left", y < a ~ "Right", y = a ~ "C")`.
I am very lost since both a and y are numeric vectors. Any idea why this is?
structure(list(y = c(26.85, 26.85, 26.85, 26.85, 26.85,
26.85, 26.85, 26.85, 26.85, 26.85, 26.85, 26.85, 26.85, 26.85,
26.85, 26.85, 26.85, 26.85, 26.85, 26.85), a = c(26.67, 36.47,
44.16, 22.01, 36.15, 28.7, 26.63, 31.12, 20.53, 43.49, 21.83,
26.59, 26.71, 26.85, 26.67, 36.47, 44.17, 22, 36.15, 28.7)), row.names = c(NA,
-20L), class = c("tbl_df", "tbl", "data.frame"))
Upvotes: 0
Views: 123
Reputation: 160717
= is assignment (inside a function call it names an argument); == is a test of equality. Change to y == a ~ "C".
df %>%
mutate(Side = case_when(y > a ~ "Left",
y < a ~ "Right",
y == a ~ "C"))
# # A tibble: 20 x 3
# y a Side
# <dbl> <dbl> <chr>
# 1 26.8 26.7 Left
# 2 26.8 36.5 Right
# 3 26.8 44.2 Right
# 4 26.8 22.0 Left
# 5 26.8 36.2 Right
# 6 26.8 28.7 Right
# 7 26.8 26.6 Left
# 8 26.8 31.1 Right
# 9 26.8 20.5 Left
# 10 26.8 43.5 Right
# 11 26.8 21.8 Left
# 12 26.8 26.6 Left
# 13 26.8 26.7 Left
# 14 26.8 26.8 C
# 15 26.8 26.7 Left
# 16 26.8 36.5 Right
# 17 26.8 44.2 Right
# 18 26.8 22 Left
# 19 26.8 36.2 Right
# 20 26.8 28.7 Right
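As a small illustration of the difference, here is a base-R sketch (the name args and the toy vectors are made up for this example). It only shows how the two operators are read inside a call; it does not try to reproduce the original error.

```r
# Inside a function call, `=` attaches a name to an argument; it does not compare.
# `args` is a hypothetical name used only for this illustration.
args <- list(y = a ~ "C")          # `a` is never evaluated: a formula quotes its terms
names(args)                        # the argument was simply named "y"
inherits(args[["y"]], "formula")   # and its value is the formula a ~ "C"

# `==` is the vectorised test of equality
y <- c(1, 2, 3)
a <- c(1, 5, 3)
y == a                             # TRUE FALSE TRUE
```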
Having said that ... beware, floating-point equality is not always perfect. Computers have limitations when it comes to floating-point numbers (aka double, numeric, float). This is a fundamental limitation in how computers deal with non-integer numbers, and it is not specific to any one programming language. There are some add-on libraries or packages that are much better at arbitrary-precision math, but I believe most mainstream languages (this is relative/subjective, I admit) do not use these by default. Refs: "Why are these numbers not equal?", "Is floating point math broken?", and https://en.wikipedia.org/wiki/IEEE_754.
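A quick way to see this for yourself, using the classic 0.1 + 0.2 case (a minimal base-R sketch; the variable name x is just for illustration):

```r
x <- 0.1 + 0.2

x == 0.3                    # FALSE: the two doubles differ in their last bits
isTRUE(all.equal(x, 0.3))   # TRUE: all.equal() compares within a tolerance
abs(x - 0.3) < 1e-8         # TRUE: explicit absolute-difference test
```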
If you find yourself in a situation where you "know" that two numbers are the same but == returns FALSE, consider converting to a measure of absolute-difference, perhaps something like:
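eps <- 1e-8
df %>%
  # case_when() uses the first condition that matches, so the tolerance
  # test goes first; otherwise near-equal values fall into "Left"/"Right"
  mutate(Side = case_when(abs(y - a) < eps ~ "C",
                          y > a ~ "Left",
                          y < a ~ "Right"))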
The actual value to use for eps is specific to your needs; given the data we see here in df, 1e-8 is likely to be sufficient. In other data, look for a value that is well smaller than the expected range, but still an order of magnitude or more larger than .Machine$double.eps (about 2e-16).
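If you would rather not pick eps by hand, dplyr also provides near(), which compares two numeric vectors within a tolerance (by default .Machine$double.eps^0.5, roughly 1.5e-8). Below is a minimal sketch on made-up toy data (toy and res are hypothetical names, not part of the question). The tolerance condition is listed first because case_when() takes the first condition that matches.

```r
library(dplyr)

# Toy data, invented for this sketch: row 1 is "equal" only up to rounding error
toy <- tibble(y = c(0.1 + 0.2, 1, 3),
              a = c(0.3,       2, 2))

res <- toy %>%
  mutate(Side = case_when(near(y, a) ~ "C",      # tolerance test first
                          y > a      ~ "Left",
                          y < a      ~ "Right"))

res$Side   # "C" "Right" "Left"
```

With y > a listed first instead, the first row would come out as "Left", since 0.1 + 0.2 is very slightly larger than 0.3.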
Upvotes: 3