CLM

Reputation: 119

How to use IF to count the cells with missing values (NA) in a data.frame column?

I have a data.frame object named "selPOGs2", to which I am adding a column "GeneID". After converting the data.frame to character, I fill the "GeneID" column with values returned from a database query. If the query finds no match, NA is placed in the corresponding cell of the GeneID column. The column looks something like this:


   GeneID
  1. NA
  2. NA
  3. 14297062
  4. 14006762
  5. 11538038


I want to count how many times NA occurs in the GeneID column. I wrote the following code:

  #convert selPOGs2 from factor to character (to make sure it really is character and not factor)
  selPOGs2 <- data.frame(lapply(selPOGs2, as.character), stringsAsFactors=FALSE)

  a=0; is.numeric(a)
  for(c in selPOGs2[,1])
  {b <- as.character(c) 
      if(b[1]== NA_character_) 
      {  a=a+1   }
      else {a=a}
  }

I get the following error:

Error in if (b[1] == NA_character_) { : missing value where TRUE/FALSE needed

I get the same error regardless of whether I compare b[1] with "14297062" or with any other "...".

If I comment out the IF code, the value of c or b[1] is reported as "14297062":

      a=0; is.numeric(a)
      for(c in selPOGs2[,1])
      {b <- as.character(c) 
          #if(b[1]== NA_character_) 
          #{  a=a+1   }
          #else {a=a}
      }

However, as soon as I un-comment the IF lines, the value of c or b[1] is reported as NA_character_.

If I use

  a=0; is.numeric(a)
  for(c in selPOGs2[,1])
  {b <- as.character(c) 
      if(1==1) {}
  }

then again the value of c or b[1] is reported as "14297062".

Upvotes: 1

Views: 425

Answers (1)

Sven Hohenstein

Reputation: 81683

You can use

sum(is.na(selPOGs2$GeneID))

to count the NAs.
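As a minimal sketch of why the loop in the question fails and how `is.na()` avoids the problem: in R, comparing any value with NA using `==` returns NA rather than TRUE or FALSE, and `if()` cannot branch on NA. The small data.frame below is a hypothetical stand-in built from the five sample values shown in the question, and the variable names `cmp`, `na_count`, and `value` are illustrative only.

```r
# Hypothetical stand-in for the asker's data; only the column name
# and the five sample values come from the question.
selPOGs2 <- data.frame(
  GeneID = c(NA, NA, "14297062", "14006762", "11538038"),
  stringsAsFactors = FALSE
)

# Comparing with NA yields NA, not TRUE or FALSE. This is what makes
# if() stop with "missing value where TRUE/FALSE needed".
cmp <- selPOGs2$GeneID[1] == NA_character_
print(cmp)

# Vectorised count: is.na() returns a logical vector and sum()
# counts its TRUE entries.
na_count <- sum(is.na(selPOGs2$GeneID))
print(na_count)

# The loop from the question, with is.na() in place of the == comparison.
a <- 0
for (value in selPOGs2$GeneID) {
  if (is.na(value)) {
    a <- a + 1
  }
}
print(a)
```

With the sample column above, the first cell is NA, so the original loop would stop with the error on its first iteration. That likely explains why c and b[1] appear as NA_character_ whenever the IF lines are active.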

Upvotes: 1
