Sultan

Reputation: 187

Box plot for multiple columns with normalized y-axis values

I have the following data (in a CSV file):

 product  release_after_issue  release_before_issue
 P1                            40
 P1       100
 P1                            10
 P2       50
 P2       300
 P2                            200
 P3       10
 P3       20
 P3       300

I would like to use a box plot to show the distribution of days for each product release (P1, P2, etc.) based on release_after_issue and release_before_issue. The x-axis is the product names and the y-axis is days.

The issues I am facing now are the empty values in each column and the large numbers of days.

How could I normalize the days on the y-axis to months (easier to read)? I would also like each product (P1, P2, etc.) to have its own box plot based on each column's data (release_after_issue or release_before_issue).

I tried to omit the NA values and plot a test example, but it did not work:

data <- read.csv("commons-fileupload.csv")
    ggplot(data[!is.na(data$release_after_issue),],aes(x=product,y=release_after_issue))
    + geom_point()

Any help would be appreciated!

Upvotes: 0

Views: 678

Answers (1)

Julien Massardier

Reputation: 1476

I'm not sure what fails in your code; the dummy data below works fine for me. Also, ggplot removes the NAs for you (it only prints a warning about the removed rows).

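library(ggplot2)
# Dummy data with one NA; geom_boxplot() drops it with a warning
data <- data.frame(product = c("P1", "P2", "P1", "P1", "P2"), release_after_issue = c(100, NA, 50, 10, 30))
ggplot(data, aes(x = product, y = release_after_issue)) + geom_boxplot()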

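For the two-column, months-on-the-y-axis version the question asks about, here is a minimal sketch rather than a definitive solution. It assumes the empty cells are read as NA, it retypes the question's table by hand instead of reading "commons-fileupload.csv", and it approximates a month as 30 days. The names long, type, months and p are made up for illustration. One likely cause of the failure in the question is that `+ geom_point()` starts a new line: R treats the ggplot(...) line as complete, so the `+` must go at the end of the previous line.

```r
library(ggplot2)

# Assumed reconstruction of the question's data; NA marks an empty cell.
data <- data.frame(
  product = c("P1", "P1", "P1", "P2", "P2", "P2", "P3", "P3", "P3"),
  release_after_issue  = c(NA, 100, NA, 50, 300, NA, 10, 20, 300),
  release_before_issue = c(40, NA, 10, NA, NA, 200, NA, NA, NA)
)

# Stack the two day columns into long format: one row per (product, type, days).
long <- rbind(
  data.frame(product = data$product, type = "release_after_issue",
             days = data$release_after_issue),
  data.frame(product = data$product, type = "release_before_issue",
             days = data$release_before_issue)
)

# Drop the empty cells and convert days to months (assumption: 30 days per month).
long <- long[!is.na(long$days), ]
long$months <- long$days / 30

# Each `+` ends a line, so R knows the expression continues on the next one.
p <- ggplot(long, aes(x = product, y = months, fill = type)) +
  geom_boxplot() +
  labs(x = "Product", y = "Months", fill = "Column")
print(p)
```

Mapping type to fill draws the boxes for the two columns side by side within each product. If separate panels are preferred, adding facet_wrap(~ type) is an alternative.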

Upvotes: 1
