Reputation: 1459
I have a dataframe with incorrect values for two variables (Lat and Lon). The incorrect values in the dataframe are listed as 999.00, and the correct values should be 42.68 and -72.47, respectively.
I would like an easy way to replace these values using dplyr, but my attempts (see below) have been unsuccessful (errors provided below).
df$Lat2 <- recode(df$Lat, "999.00"="42.68", .default=x)
Error in lapply(x, f) : object 'x' not found
df <- df %>%
mutate(Lat2 = if_else(Lat == 999.00, 42.68, NULL, NULL))
Error in mutate_impl(.data, dots) : Evaluation error: unused argument (recvLat = 999).
df <- df %>%
mutate(Lat2 = ifelse(Lat == 999.00, 42.68, NULL))
Error in mutate_impl(.data, dots) : Evaluation error: replacement has length zero. In addition: Warning message: In rep(no, length.out = length(ans)) : 'x' is NULL so the result will be NULL
df <- df %>%
mutate(Lat2 = case_when(Lat == 999.00 ~ 42.68, TRUE ~ NULL))
Error in mutate_impl(.data, dots) : Evaluation error: subscript out of bounds.
For the latter three attempts, I get the same error if the number are in quotes (i.e. "999.00" and "42.68")
Upvotes: 1
Views: 241
Reputation: 887971
We can place NULL
in a list
df %>%
mutate(Lat2 = ifelse(recvLat == 999.00, 42.68, list(NULL)))
# recvLat Lat2
#1 999.0 42.68
#2 1.5 NULL
#3 2.5 NULL
Instead of NULL
, it can be NA
df %>%
mutate(Lat2 = ifelse(recvLat == 999.00, 42.68, NA_real_))
# recvLat Lat2
#1 999.0 42.68
#2 1.5 NA
#3 2.5 NA
If we want to do the opposite, just use !=
df %>%
mutate(Lat2 = ifelse(recvLat != 999.00, 42.68, NA_real_))
Based on the OP's comment,
df %>%
mutate(Lat2 = ifelse(recvLat == 999.00, 42.68, recvLat))
In base R
, we can do this by creating an index
i1 <- df$recvLat == 999
df$recLat[i1] <- 42.68
NOTE: Both the solutions work.
df <- data.frame(recvLat = c(999, 1.5, 2.5))
Upvotes: 0
Reputation: 3720
Actually, this is a nice case to show the elegance of data.table
as well.
library(data.table)
## Create example
data <- data.table(lat = c(999, 0, 0),
lon = c(0, 999, 0))
## Reassign values
data[lat==999, lat := 42.68]
data[lon==999, lon := -72.47]
## Print results
data
# lat lon
# 1: 42.68 0.00
# 2: 0.00 -72.47
# 3: 0.00 0.00
The downside is that you have to remember that :=
is needed for assignment.
The upsides are
data.table
syntax is much simpler and consistent (i, j, group by)Upvotes: 1
Reputation: 2467
why not use this.
data=as.data.frame(matrix(0,3,3))
names(data)=c("a","b","c")
data$a[1]=999
data$c[2]=999
data$a[which(data$a==999)]=42.68
data$c[which(data$c==999)]=-72.47
data
a b c
1 42.68 0 0.00
2 0.00 0 -72.47
3 0.00 0 0.00
Upvotes: 2