Reputation:
How can I find out how many values in a dataset are NA, or whether the dataset contains any NAs or NaNs at all?
Upvotes: 8
Views: 33717
Reputation: 3
In R, summary() reports the number of NAs in a vector, or in each numeric column of a data frame. The NA count only appears in the output when there is at least one missing value.
For a vector x
summary(x)
For a data frame df
summary(df)
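As a minimal sketch of what to look for in the output (the vector `x` and data frame `df` below are made-up illustration data, not from the question), the count shows up under an entry named `NA's`:

```r
# Hypothetical data, for illustration only
x <- c(1, NA, 3, NA)
s <- summary(x)
s              # the last entry, "NA's", is the number of missing values
s[["NA's"]]    # extract just the count

# With no missing values, summary() omits the "NA's" entry entirely
names(summary(c(1, 2, 3)))

# For a data frame, each numeric column gets its own "NA's" row
df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))
summary(df)
```

Because the entry is absent when nothing is missing, this is handy for eyeballing data but less convenient than sum(is.na(x)) when the count is needed in code.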
Upvotes: 0
Reputation: 303
For a whole data frame:
sum(is.na(df))
where df is the data frame.
For a particular column of the data frame you can use:
sum(is.na(df$col))
or
cnt <- 0
for (i in df$col) {
  if (is.na(i)) {
    cnt <- cnt + 1
  }
}
cnt
Here cnt is the number of NAs in the column.
Upvotes: 0
Reputation: 840
This may also work fine
sum(is.na(df)) # For entire dataset
For a particular column in a dataset:
sum(is.na(df$col1))
Or, to check all the columns at once, as mentioned by @nicola:
colSums(is.na(df))
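A small worked example of the three calls above, using a made-up data frame (the names `df`, `col1`, and `col2` are only for illustration):

```r
# Hypothetical data frame with 3 missing values in total
df <- data.frame(
  col1 = c(1, NA, 3, NA),
  col2 = c("a", "b", NA, "d")
)

total_na   <- sum(is.na(df))        # whole data frame: 3
col1_na    <- sum(is.na(df$col1))   # single column: 2
per_column <- colSums(is.na(df))    # named vector: col1 = 2, col2 = 1

total_na
col1_na
per_column
```

This works because is.na(df) returns a logical matrix with the same shape as df, and TRUE counts as 1 when summed.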
Upvotes: 14
Reputation: 7464
As @Roland noticed, there are multiple functions for finding and dealing with missing values in R (see help("NA") and here).
Example:
Create a fake dataset with some NAs:
data <- matrix(1:300,,3)
data[sample(300, 40)] <- NA
Check if there are any missing values:
anyNA(data)
Check column by column whether there are any missing values:
apply(data, 2, anyNA)
Check percentages and counts of missing values in columns:
colMeans(is.na(data))*100
colSums(is.na(data))
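Since the question also asks about NaN: is.na() is TRUE for both NA and NaN, while is.nan() is TRUE only for NaN. A minimal sketch of telling them apart, using a made-up numeric matrix `m` (not the dataset above, which is integer and so cannot hold NaN):

```r
# Hypothetical numeric matrix: column 1 has one NA and one NaN, column 2 has one NaN
m <- matrix(c(1, NA, NaN, 4, 5, NaN), ncol = 2)

n_missing <- sum(is.na(m))                 # NA and NaN together: 3
n_nan     <- sum(is.nan(m))                # NaN only: 2
n_na_only <- sum(is.na(m) & !is.nan(m))    # NA but not NaN: 1

nan_per_col <- colSums(is.nan(m))          # NaN count per column: 1 and 1

# is.nan() does not accept a data frame directly, so for a data frame
# with numeric columns one option is to apply it column by column:
df_num <- as.data.frame(m)
nan_per_df_col <- vapply(df_num, function(col) sum(is.nan(col)), numeric(1))
```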
Upvotes: 4