R package or functions allowing two kinds of missing values (kept in or dropped from regressions) and handling labelled variables as both numeric and character

Question

I work with survey data and would like functions ideal_label and is.missing with the following behavior:

test <- ideal_label(c(1, NA, -1), 
                    labels = structure(c(0, 1, -1), names = c("No", "Yes", "PNR")), 
                    missing.values = c(NA, -1))
  
as.character(test[1])   # "Yes"
as.numeric(test[1])   # 1
test %in% 1   # TRUE FALSE FALSE
test == 1   # TRUE NA FALSE
test %in% "Yes"   # TRUE FALSE FALSE
test == "Yes"   # TRUE NA FALSE
is.na(test)   # FALSE TRUE FALSE
is.missing(test)   # FALSE TRUE TRUE 
lm(c(T, T, T) ~ test)$rank   # 2 (i.e., keeps missing values that are not NA)
df <- data.frame(test = test, true = c(T, T, T))
lm(true ~ test, data = df)$rank   # 2

This used to be possible with the functions as.item and is.missing from the memisc package, using memisc version 0.99.22 and R version 4.2.1.

However, more recent versions of memisc treat non-NA missing values the same as NA (i.e., is.na(test[3]) returns TRUE). Moreover, memisc version 0.99.22 under more recent versions of R tends to treat labelled variables as numeric rather than character (namely, test[1] == "Yes" returns NA and test[1] %in% "Yes" returns FALSE).

I have tested other packages (haven, labelled, forcats), but none of them seems to allow the behavior I need.

How do I achieve this with the latest versions of these libraries?
