beginner

Reputation: 1059

Box plot with columns of different lengths from two data frames

I have two dataframes. Their lengths differ.

df1:
 Samples   Number
 A9GS        73
 A9GY        142
 ASNO        327
 A5UE        131

df2:
 Samples   Number
 AUFS        107
 A9JY        42
 AKNO        32
 A9FE        111
 A9GY        12
 ADNO        37
 A2KE        451

I ran a Wilcoxon test on them:

wilcox.test(df1$Number,df2$Number, correct=FALSE)

This gave me a p-value. To visualise the result I used the boxplot function, which gave the following error:

boxplot(df1$Number ~ df2$Number, xlim=c(0.5,3))
Error in model.frame.default(formula = df1$Number ~ df2$Number) : 
  variable lengths differ (found for 'df2$Number')

Can anyone correct my mistake and also tell me how to show the p-value on the plot? Thank you.

Upvotes: 2

Views: 4335

Answers (2)

IRTFM

Reputation: 263331

You would only be able to use the formula interface if there were a 1-1 pairing between the rows of those two data frames (with the RHS usually a grouping variable rather than a numeric one), which clearly there is not. You need to pass the values to boxplot as a list rather than as a formula. I'll see if I can construct a working example.

The plot is achieved with:

png(); boxplot( list(df1_N=df1$Number, df2_N = df2$Number) ); dev.off()
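For comparison, the formula interface does work once the values are stacked into one data frame with a grouping column. This is only a sketch using the question's data; the names `stacked` and `Source` are made up for illustration.

```r
# Data from the question
df1 <- data.frame(Samples = c("A9GS", "A9GY", "ASNO", "A5UE"),
                  Number  = c(73, 142, 327, 131))
df2 <- data.frame(Samples = c("AUFS", "A9JY", "AKNO", "A9FE",
                              "A9GY", "ADNO", "A2KE"),
                  Number  = c(107, 42, 32, 111, 12, 37, 451))

# Stack the rows and record which data frame each row came from.
# `stacked` and `Source` are illustrative names, not from the question.
stacked <- rbind(cbind(df1, Source = "df1"),
                 cbind(df2, Source = "df2"))

# Both sides of the formula now have 11 rows, and the RHS is a
# grouping variable, so the "variable lengths differ" error goes away.
boxplot(Number ~ Source, data = stacked)
```

The list form above avoids the stacking step, which is why it is the shorter route here.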


And annotation can be done with the text function which accepts a ?plotmath argument typically constructed with bquote.

text( 1.5, 400, 
   label=bquote( 
       p~value == .(wilcox.test(df1$Number,df2$Number, correct=FALSE)$p.value)
    ) )

If you want to round the p-value, wrap the expression inside .( ) in round( ... ).
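A rounded version might look like the sketch below. It rebuilds the question's data so it runs on its own; the variable `p` and the choice of 3 decimal places are assumptions for illustration.

```r
# Data from the question
df1 <- data.frame(Samples = c("A9GS", "A9GY", "ASNO", "A5UE"),
                  Number  = c(73, 142, 327, 131))
df2 <- data.frame(Samples = c("AUFS", "A9JY", "AKNO", "A9FE",
                              "A9GY", "ADNO", "A2KE"),
                  Number  = c(107, 42, 32, 111, 12, 37, 451))

# Compute the p-value once and keep it in `p` (illustrative name)
p <- wilcox.test(df1$Number, df2$Number, correct = FALSE)$p.value

# Draw the two boxes from a list, then annotate between them.
# Box centres sit at x = 1 and x = 2, so x = 1.5 is the midpoint.
boxplot(list(df1_N = df1$Number, df2_N = df2$Number))
text(1.5, 400, labels = bquote(p~value == .(round(p, 3))))
```

If the p-value is very small, round() can print 0; signif(p, 2) is one alternative that keeps significant digits.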

Upvotes: 2

triddle

Reputation: 1231

Just put the two data frames together, and then paste the p-value onto the plot:

df1 <- data.frame(samples = c('A9GS', 'A9GY', 'ASNO', 'A5UE'),
                  number = c(73, 142, 327, 131))
df2 <- data.frame(samples = c('AUFS', 'A9JY', 'AKNO', 'A9FE', 'A9GY', 'ADNO',
                              'A2KE'),
                  number = c(107, 42, 32, 111, 12, 37, 451))

df1$group <- 'df1'
df2$group <- 'df2'

df <- rbind(df1, df2)

m<-wilcox.test(df1$number,df2$number, correct=FALSE)

library(ggplot2)
jpeg('path/to/where/you/want/the/file/saved/picture.jpeg')
ggplot(df, aes(x=group, y=number, group=group)) + 
  geom_boxplot() +
  annotate('text', label=paste('p =', round(m$p.value, 2)), x=.5, y=400)
dev.off()


Upvotes: 0
