spiral01

Reputation: 545

R ggplot2: Error when plotting facet grid that doesn't occur for single plot

I have a data frame with a column holding a continuous variable. I wanted to bin this data into a new column so that I could produce a clearer plot. I did this like so:

# cut2() comes from the Hmisc package
library(Hmisc)

# Add new column to data frame
mydf2["conDistanceBins"] <- NA

# Bin data from the conDistance column into 5 quantile groups, stored as bin numbers 1-5
mydf2$conDistanceBins <- as.numeric(cut2(mydf2$conDistance, g=5))
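
For readers without Hmisc, the same kind of quantile binning can be sketched in base R. This is only an illustration under assumptions: con_distance is made-up data standing in for mydf2$conDistance, and the bin boundaries may not match cut2(..., g=5) exactly.

```r
# Hypothetical stand-in for mydf2$conDistance (made-up values)
set.seed(42)
con_distance <- runif(100, min = 0, max = 10)

# Quintile boundaries at 0%, 20%, ..., 100%
breaks <- quantile(con_distance, probs = seq(0, 1, by = 0.2))

# labels = FALSE returns integer bin codes 1..5;
# include.lowest = TRUE keeps the minimum value in the first bin
bins <- cut(con_distance, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Count how many values fall in each bin
table(bins)
```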

Having done this, I tried to plot the data. When I produce a single plot using ggplot2 with the following code, it comes out correctly, coloured by the bins as I hoped:

p9 <- ggplot(mydf2, aes(x = x, y = y))
p9 + geom_point(aes(color=factor(mydf2$conDistanceBins)))

x and y are also columns of the mydf2 data frame.

My issue occurs when I try to produce a facet grid like so:

p7 <- ggplot(mydf2, aes(x, y)) + geom_point(aes(color=factor(mydf2$conDistanceBins)))
p7 + facet_grid(Chromosome~., margins = TRUE)

Chromosome is another column from my data frame. However, when I attempt to run this code I get the following error:

Error: Aesthetics must be either length 1 or the same as the data (12390): colour, x, y

What I do not understand is why the code works in one instance but not in the other. Isn't the second bit of code, in essence, just taking the first and creating a facet grid split by the Chromosome column of my data frame?

Edit: Here is a portion of my data frame.

           x          y          z       Gene Chromosome Pos.start boot_avg boot_low
1 -0.2201704  2.2914659 -1.0503592 AGAP000002          X       582       46        5
2 -1.6164962 -0.4252216  4.1920188 AGAP000007          X     83817       25        0
3  0.1585863 -2.1869117  0.5772591 AGAP000010          X    120773       79        2
4 -1.5126431 -0.2293787  2.9891040 AGAP000011          X    127704       54       10
5 -1.5382538 -0.1100106 -0.1838767 AGAP000012          X    146181       84       64
  boot_avglow branch_avg branch_low branch_avglow conDistance invDistance
1           9 0.01891250   0.001469      0.001865    4.472136    3.464102
2           0 0.01518050   0.000000      0.000000    6.403124    7.416198
3          39 0.02026960   0.001955      0.003372    3.741657    5.099020
4          10 0.01040867   0.003530      0.003735    6.244998    7.280110
5          67 0.01626420   0.000257      0.001936    4.123106    3.000000
  Acceptable Bootstrap Cluster conDistanceBins invDistanceBins
1      Below threshold       1               3               1
2      Below threshold       2               5               5
3      Above threshold       3               2               2
4      Above threshold       2               5               5
5      Above threshold       4               2               1

Upvotes: 2

Views: 371

Answers (1)

Mike H.

Reputation: 14360

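It looks like your problem is coming from using mydf2$conDistanceBins inside aes(). If you just change mydf2$conDistanceBins to conDistanceBins, I don't get the error anymore. A likely reason is that, with margins = TRUE, facet_grid duplicates the rows to draw the extra "(all)" panel, so the data the layer sees is longer than the mydf2$conDistanceBins vector, whereas a bare column name is looked up inside that expanded data. See the code below: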

p7 <- ggplot(mydf2, aes(x, y)) + geom_point(aes(color=factor(conDistanceBins)))
p7 + facet_grid(Chromosome~., margins = TRUE)
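
To see the difference between the two spellings, here is a minimal sketch on made-up data. The names toy_df, chrom and bins are hypothetical and not from the question, and the sketch assumes ggplot2 is installed. It builds the plot with a bare column name and compares row counts, then tries the $ form and captures the outcome rather than stopping.

```r
library(ggplot2)

# Made-up data; toy_df and its columns are hypothetical
set.seed(1)
toy_df <- data.frame(
  x = rnorm(20),
  y = rnorm(20),
  chrom = rep(c("X", "2L"), each = 10),
  bins = rep(1:5, times = 4)
)

# Bare column name: evaluated inside whatever data the layer receives,
# including any rows facet_grid adds for the margin panel
ok <- ggplot(toy_df, aes(x, y)) +
  geom_point(aes(color = factor(bins))) +
  facet_grid(chrom ~ ., margins = TRUE)
built <- ggplot_build(ok)$data[[1]]
nrow(toy_df)  # rows in the original data
nrow(built)   # rows the point layer draws, margin panel included

# toy_df$bins: a fixed-length vector taken from outside the layer data
bad <- ggplot(toy_df, aes(x, y)) +
  geom_point(aes(color = factor(toy_df$bins))) +
  facet_grid(chrom ~ ., margins = TRUE)

# Capture the error message (if any) instead of halting the script
bad_result <- tryCatch(ggplot_build(bad), error = function(e) conditionMessage(e))
```

Comparing nrow(toy_df) with nrow(built) shows how many rows the margin panel adds, and bad_result holds either the built plot or the error text, depending on how the installed ggplot2 version handles the length mismatch.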


Data: I only used the relevant pieces of your data:

mydf2<- structure(list(x = c(-0.2201704, -1.6164962, 0.1585863, -1.5126431, 
    -1.5382538), y = c(2.2914659, -0.4252216, -2.1869117, -0.2293787, 
    -0.1100106), z = c(-1.0503592, 4.1920188, 0.5772591, 2.989104, 
    -0.1838767), Gene = structure(1:5, .Label = c("AGAP000002", "AGAP000007", 
    "AGAP000010", "AGAP000011", "AGAP000012"), class = "factor"), 
        Chromosome = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "X", class = "factor"), 
        Pos.start = c(582L, 83817L, 120773L, 127704L, 146181L), boot_avg = c(46L, 
        25L, 79L, 54L, 84L), boot_low = c(5L, 0L, 2L, 10L, 64L), 
        boot_avglow = c(9L, 0L, 39L, 10L, 67L), branch_avg = c(0.0189125, 
        0.0151805, 0.0202696, 0.01040867, 0.0162642), branch_low = c(0.001469, 
        0, 0.001955, 0.00353, 0.000257), branch_avglow = c(0.001865, 
        0, 0.003372, 0.003735, 0.001936), conDistance = c(4.472136, 
        6.403124, 3.741657, 6.244998, 4.123106), invDistance = c(3.464102, 
        7.416198, 5.09902, 7.28011, 3), conDistanceBins = c(3, 5, 
        1, 4, 2)), row.names = c("1", "2", "3", "4", "5"), .Names = c("x", 
    "y", "z", "Gene", "Chromosome", "Pos.start", "boot_avg", "boot_low", 
    "boot_avglow", "branch_avg", "branch_low", "branch_avglow", "conDistance", 
    "invDistance", "conDistanceBins"), class = "data.frame")

Upvotes: 2
