Reputation: 872
I would like to implement concatenation across columns that removes NAs and observes the Oxford comma.
x <- data.frame(ID = 1:3,
                col1 = c("snap", "snap", NA),
                col2 = c(NA, "crackle", "crackle"),
                col3 = c(NA, NA, "pop"),
                col4 = c(NA, "yummy", NA))
Using the above data frame, I'd like to concatenate col1:col4 and store the result in x$treats, so that:
x$treats[1]
"snap"
x$treats[2]
"snap, crackle, and yummy"
x$treats[3]
"crackle and pop"
The dataset also has an ID variable that should not be included in the concatenation (so solutions that don't allow me to specify the required columns aren't complete).
Upvotes: 3
Views: 92
Reputation: 79198
> x <- data.frame(ID = 1:3,
                  col1 = c("snap", "snap", NA),
                  col2 = c(NA, "crackle", "crackle"),
                  col3 = c(NA, NA, "pop"),
                  col4 = c(NA, "yummy", NA), stringsAsFactors = FALSE)
> a <- gsub("(\\w)\\s+", "\\1, ", trimws(do.call(paste, replace(x[-1], is.na(x[-1]), ""))))
> (x1 <- transform(x, treat = gsub(",\\s(\\w+)$", ", and \\1", a), stringsAsFactors = FALSE))
  ID col1    col2 col3  col4                    treat
1  1 snap    <NA> <NA>  <NA>                     snap
2  2 snap crackle <NA> yummy snap, crackle, and yummy
3  3 <NA> crackle  pop  <NA>         crackle, and pop
> x1$treat[1]
[1] "snap"
> x1$treat[2]
[1] "snap, crackle, and yummy"
> x1$treat[3]
[1] "crackle, and pop"
You can also use glue_collapse from the glue package (named collapse in older glue releases):
> x$treat <- apply(x[-1], 1, function(y) glue::glue_collapse(y[!is.na(y)], ", ", last = ", and "))
> x$treat[1]
[1] "snap"
> x$treat[2]
[1] "snap, crackle, and yummy"
> x$treat[3]
[1] "crackle, and pop"
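Both approaches above return "crackle, and pop" for a row with exactly two items, whereas the question asks for "crackle and pop". One possible way to close that gap, sketched here in base R, is an extra substitution that drops the comma when the whole string is just "word, and word". This step is an addition rather than part of the answer as posted, the cols vector is a name introduced here for illustration, and the sketch assumes (as the regexes above already do) that every item is a single word.

```r
x <- data.frame(ID = 1:3,
                col1 = c("snap", "snap", NA),
                col2 = c(NA, "crackle", "crackle"),
                col3 = c(NA, NA, "pop"),
                col4 = c(NA, "yummy", NA),
                stringsAsFactors = FALSE)

cols <- paste0("col", 1:4)  # columns to concatenate; ID is left out

# Same steps as above: blank out NAs, paste across columns, comma-separate
vals <- replace(x[cols], is.na(x[cols]), "")
a <- gsub("(\\w)\\s+", "\\1, ", trimws(do.call(paste, vals)))
treat <- gsub(",\\s(\\w+)$", ", and \\1", a)

# Extra step (an addition, not part of the original answer):
# a string that is exactly "word, and word" loses its comma
x$treat <- sub("^(\\w+), and (\\w+)$", "\\1 and \\2", treat)
x$treat
#> [1] "snap"                     "snap, crackle, and yummy"
#> [3] "crackle and pop"
```

Selecting the columns by name instead of with x[-1] also keeps the result stable if further non-treat columns are added to the data frame later.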
Upvotes: 0
Reputation: 11878
Here's another option, although considerably more verbose. By wrapping the list generation into a function, we can also add an option to disable the Oxford comma, if desired:
x <- data.frame(
  ID = 1:3,
  col1 = c("snap", "snap", NA),
  col2 = c(NA, "crackle", "crackle"),
  col3 = c(NA, NA, "pop"),
  col4 = c(NA, "yummy", NA)
)
language_list <- function(x, oxford_comma = TRUE) {
  x <- x[!is.na(x)]
  if (length(x) < 2) {
    return(x)
  }
  last <- tail(x, 1)
  rest <- head(x, -1)
  if (length(rest) == 1) {
    return(paste(rest, "and", last))
  }
  rest <- paste(rest, collapse = ", ")
  paste0(rest, if (oxford_comma) ",", " and ", last)
}
cols <- paste0("col", 1:4)
x$treats <- apply(x[, cols], 1, language_list)
x$treats
#> [1] "snap" "snap, crackle, and yummy"
#> [3] "crackle and pop"
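To illustrate the oxford_comma option mentioned above: apply() forwards extra arguments to the function it calls, so the flag can be given directly in the apply() call. The following is a minimal self-contained sketch that repeats the data and function from this answer; the treats_plain name is only an illustrative choice.

```r
x <- data.frame(
  ID = 1:3,
  col1 = c("snap", "snap", NA),
  col2 = c(NA, "crackle", "crackle"),
  col3 = c(NA, NA, "pop"),
  col4 = c(NA, "yummy", NA)
)

language_list <- function(x, oxford_comma = TRUE) {
  x <- x[!is.na(x)]
  if (length(x) < 2) {
    return(x)
  }
  last <- tail(x, 1)
  rest <- head(x, -1)
  if (length(rest) == 1) {
    return(paste(rest, "and", last))
  }
  rest <- paste(rest, collapse = ", ")
  paste0(rest, if (oxford_comma) ",", " and ", last)
}

cols <- paste0("col", 1:4)

# Extra arguments after the function are passed on to language_list()
treats_plain <- unname(apply(x[, cols], 1, language_list, oxford_comma = FALSE))
treats_plain
#> [1] "snap"                    "snap, crackle and yummy"
#> [3] "crackle and pop"
```

Only rows with three or more items change, since the Oxford comma never appears in shorter lists.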
Upvotes: 1