Reputation: 79
Given a data frame such as
df <- data.frame(name = c("a", "b", "c", "d", "e"),
class = c("a1", "a1", "a1", "b1", "b1"),
var1 = c("S", "S", "R", "S", "S"),
var2 = c("S", "R", NA, NA, "R"),
var3 = c(NA, "R", "R", "S", "S"))
I would like to plot the number of non-NA rows for each of var1 through var3.
One way I found is to build a second data frame:
df_count <- matrix(nrow=3, ncol=2)
df_count <- as.data.frame(df_count)
names(df_count) <- c("var_num", "count")
df_count$var_num <- as.factor(names(df)[3:5])
for (i in 1:3) {
df_count[i,2] <- sum(!is.na(df[,i+2]))
}
and then plot as
ggplot(df_count, aes(x=var_num, y=count)) + geom_bar(stat="identity")
Is there an easier way to choose var1 through var3 and count the valid rows without generating a new dataframe?
Upvotes: 3
Views: 2645
Reputation: 12713
library('ggplot2')
library('reshape2')

# Count of non-NA values per variable
df_long <- melt(df, id.vars = c('name', 'class'))  # reshape to long format
df_long <- df_long[!is.na(df_long$value), ]        # drop rows where value is NA
# after aggregation, every non-grouping column (including value) holds the row count
df_count <- aggregate(df_long, by = list(df_long$variable), FUN = length)
ggplot(df_count, aes(x = Group.1, y = value, fill = Group.1)) +
  geom_bar(stat = "identity")

# Same count, additionally split by value (S / R); reuses df_long from above
df_count2 <- aggregate(df_long, by = list(df_long$variable, df_long$value), FUN = length)
ggplot(df_count2, aes(x = Group.1, y = value, fill = Group.2)) +
  geom_bar(stat = "identity")
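For comparison, here is a minimal base R sketch of the plain per-column count, using colSums() on the is.na() matrix instead of melting. It is a different technique from the reshape2 approach above, not part of it. The names vars, counts and df_count are chosen for illustration, and the plot step assumes ggplot2 is installed. It still builds a small plotting frame, but in one step and without a loop.

```r
# Same data as in the question
df <- data.frame(name = c("a", "b", "c", "d", "e"),
                 class = c("a1", "a1", "a1", "b1", "b1"),
                 var1 = c("S", "S", "R", "S", "S"),
                 var2 = c("S", "R", NA, NA, "R"),
                 var3 = c(NA, "R", "R", "S", "S"))

# Columns to count (illustrative selection; adjust as needed)
vars <- c("var1", "var2", "var3")

# is.na() on a data frame returns a logical matrix, so colSums()
# gives the number of non-NA entries per column: var1 = 5, var2 = 3, var3 = 4
counts <- colSums(!is.na(df[, vars]))

# Small frame for plotting, built in one step
df_count <- data.frame(var_num = names(counts), count = as.vector(counts))

# Draw the bar chart only if ggplot2 is available
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df_count, aes(x = var_num, y = count)) +
    geom_bar(stat = "identity")
  print(p)
}
```

This variant only covers the total per variable. For the split by value (S / R), the long format produced by melt() remains the more convenient starting point.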
Data:
df <- data.frame(name = c("a", "b", "c", "d", "e"),
class = c("a1", "a1", "a1", "b1", "b1"),
var1 = c("S", "S", "R", "S", "S"),
var2 = c("S", "R", NA, NA, "R"),
var3 = c(NA, "R", "R", "S", "S"))
Upvotes: 3