Jim O.

Reputation: 1111

dropping NA in a list

After many failed attempts, I am posting the following data:

canada <- c(100, 80, 100, 100, 20)
korea <- c(100, "", 100, "", "")
brazil <- c(100, 90, 100, 30, 30)
fruit <- rbind(canada, korea, brazil)
colnames(fruit) <- c("apple", "orange", "banana", "grape", "kiwi")
fruit

I want the output to look like this:

> price("korea")
Thank you, StackOverflow, for the delicious apple and banana.

So I tried the following:

price <- function(val){
  val <- tolower(val)
  myrow <- fruit[val,]
  nation <- tools::toTitleCase(val)

  name.min <- names(myrow)[which.min(c(myrow))]
  name.max <- sapply(seq_along(myrow), 
                     function(x, n, i) {paste0(n[i])},
                     x=myrow, n=names(na.omit(which(myrow == max(myrow)))))
  name.max[length(name.max)] <- paste0("and ", name.max[na.omit(length(name.max))])
  name.max <- paste(name.max, collapse = ", ")

  cat(paste0("My fruits have a NAsty taste at the end: ", name.max))
} 

Which printed the following:

> price("korea")
My fruits have a NAsty taste at the end: apple, banana, NA, NA, and NA

Am I using the na.omit function wrong?

Upvotes: 2

Views: 73

Answers (2)

Damiano Fantini

Reputation: 1975

I think you could simply wrap your vector myrow in na.omit.

as.vector(na.omit(myrow))

For example:

as.vector(na.omit(c(NA, 2,2,3)))
[1] 2 2 3
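One thing that may be worth checking first is whether the row contains any NA at all. The sketch below assumes the vectors from the question: mixing numbers with "" in c() produces a character vector, so the "missing" entries are empty strings that na.omit() leaves alone. The names korea_chr and korea_num are only illustrative.

```r
# Mixing numbers and "" in c() coerces everything to character
korea_chr <- c(100, "", 100, "", "")
class(korea_chr)             # "character"
anyNA(korea_chr)             # FALSE: "" is an empty string, not NA
length(na.omit(korea_chr))   # 5: nothing was dropped

# Converting to numeric turns the empty strings into real NA values
korea_num <- as.numeric(korea_chr)
korea_num                    # 100 NA 100 NA NA
length(na.omit(korea_num))   # 2
```

A character matrix also makes max() compare strings rather than numbers, which is another reason to convert the data before looking for the maximum.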

So, if you just want to get your name.min or name.max, you could do as follows.

# which.min() and which.max() already skip NA, so index the original vector directly
name.min <- names(myrow)[which.min(myrow)]
name.max <- names(myrow)[which.max(myrow)]
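One detail to keep in mind when combining na.omit() with which.min() or which.max(): the position they return refers to the vector they were given. A small sketch with a hypothetical vector x (not from the question) whose first element is NA:

```r
x <- c(apple = NA, orange = 5, banana = 9)   # hypothetical values

# Position inside the shortened vector c(orange = 5, banana = 9)
idx_short <- which.max(na.omit(x))
names(x)[idx_short]        # "orange": position 2 of x, not the maximum

# which.max() ignores NA on its own and keeps the original position and name
idx_full <- which.max(x)
names(idx_full)            # "banana"
names(x)[idx_full]         # "banana"
```

If na.omit() is applied, taking the names from the same shortened vector, for example names(na.omit(x))[which.max(na.omit(x))], keeps the positions aligned.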

This way, you could drop the sapply() call and rewrite the function as follows.

price <- function(val){
  val <- tolower(val)
  myrow <- fruit[val,]
  nation <- tools::toTitleCase(val)
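  # which.max() skips NA by itself; applying it to na.omit(myrow) would return a
  # position in the shortened vector, which can point at the wrong name
  if (min(myrow, na.rm = TRUE) == max(myrow, na.rm = TRUE)) {
    # every non-missing value is equal: keep all of their names
    name.max <- as.vector(na.omit(names(myrow)[myrow == max(myrow, na.rm = TRUE)]))
  } else {
    name.max <- names(myrow)[which.max(myrow)]
  }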
  cat(paste0("My fruits have a NAsty taste at the end: ", paste(name.max, collapse = ", ")))
} 

With the data converted to numeric, so that the empty strings become NA, the result is as expected.

# data
canada <- as.numeric(c(100, 80, 100, 100, 20))
korea <- as.numeric(c(100, "", 100, "", ""))
brazil <- as.numeric(c(100, 90, 100, 30, 30))
fruit <- rbind(canada, korea, brazil)
colnames(fruit) <- c("apple", "orange", "banana", "grape", "kiwi")

# run
price("korea")
My fruits have a NAsty taste at the end: apple, banana
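If several fruits can tie for the highest price in a row that is not constant (as in the canada row), a variant that keeps every tied name and joins them the way the question's expected output does might look like the sketch below. The helper names top_fruits and join_and are hypothetical, and the data is assumed to be numeric with NA for the missing prices.

```r
canada <- c(100, 80, 100, 100, 20)
korea  <- c(100, NA, 100, NA, NA)
brazil <- c(100, 90, 100, 30, 30)
fruit  <- rbind(canada, korea, brazil)
colnames(fruit) <- c("apple", "orange", "banana", "grape", "kiwi")

# Names of all columns whose value equals the row maximum (NA ignored)
top_fruits <- function(country) {
  myrow <- fruit[tolower(country), ]
  # which() drops both FALSE and NA comparisons
  names(myrow)[which(myrow == max(myrow, na.rm = TRUE))]
}

# Join names as "a", "a and b", or "a, b, and c"
join_and <- function(x) {
  n <- length(x)
  if (n <= 1) return(paste(x, collapse = ""))
  if (n == 2) return(paste(x, collapse = " and "))
  paste0(paste(x[-n], collapse = ", "), ", and ", x[n])
}

join_and(top_fruits("korea"))    # "apple and banana"
join_and(top_fruits("canada"))   # "apple, banana, and grape"
```

The result can then be passed to cat() together with whatever sentence the function should print.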

Upvotes: 1

Christoph Wolk

Reputation: 1758

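You iterate over seq_along(myrow) and use each index to access n, but n has fewer elements than myrow, because it only contains the names of the elements equal to max(myrow). Indexing past the end of n returns NA, which paste0() then turns into the string "NA".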
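A minimal sketch of that effect, assuming n holds the two names found for "korea". It also suggests why the na.omit() calls in the question have nothing to remove: which() never returns NA, and by the time the missing entries appear they are ordinary strings.

```r
n <- c("apple", "banana")    # names of the maxima for "korea"

n[1:5]                       # "apple" "banana" NA NA NA
out <- paste0(n[4])          # indexing past the end gives NA ...
out                          # ... which paste0() turns into the string "NA"
is.na(out)                   # FALSE, so na.omit() would keep it
```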

If you change that, you will get only those where the value is equal to the max:

price <- function(val){

  val <- tolower(val)
  myrow <- fruit[val,]
  nation <- tools::toTitleCase(val)

  name.min <- names(myrow)[which.min(c(myrow))]
  name.max <- sapply(seq_along(na.omit(which(myrow == max(myrow)))), 
                     function(x, n, i) {paste0(n[i])},
                     x=myrow, n=names(na.omit(which(myrow == max(myrow)))))
  name.max[length(name.max)] <- paste0("and ", name.max[na.omit(length(name.max))])
  name.max <- paste(name.max, collapse = ", ")

  cat(paste0("My fruits don't have a NAsty taste at the end: ", name.max))
} 

That said, it's a bit unclear what you want the function to do. I'm sure there's a more elegant way to do it. For example, why do you calculate name.min and never use it?

If you just want the column names where the value is equal to the max value, this is much more readable and efficient:

name.max <- names(myrow[myrow == max(myrow)])
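Note that this one-liner relies on the row containing no NA, which holds for the character data in the question. If the row were numeric with real NA values, max() would return NA unless na.rm = TRUE is set. A hedged sketch with a hand-built row:

```r
# Assumed numeric version of the "korea" row
myrow <- c(apple = 100, orange = NA, banana = 100, grape = NA, kiwi = NA)

max(myrow)                   # NA, so every comparison against it is NA too

# Ignore NA when taking the maximum; which() drops the NA comparisons
name.max <- names(myrow)[which(myrow == max(myrow, na.rm = TRUE))]
name.max                     # "apple" "banana"
```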

Upvotes: 1
