swarles-barkley

Reputation: 65

ggplot not reading dataframe the way I expected

Recently I've been trying to plot some data with ggplot. The data is built as shown below: each of two data sets gets its own constant x value (2 for the new set, 1 for the old one), and the y axis shows the individual points.

  xcol = c(rep(2, length(allTTRs(teamset))))
  ycol = c(allTTRs(teamset))
  xcol2 = c(rep(1, length(allTTRs(oldteamset))))
  ycol2 = c(allTTRs(oldteamset))
  masterY = append(ycol, ycol2)
  masterX = append(xcol, xcol2)
  mat = cbind(masterX, masterY)
  df = as.data.frame(mat)
  show(df)

The show() call outputs this:

   masterX   masterY
1        2 10.998817
2        2 10.999933
3        2 37.001567
4        2 15.016150
5        1  2.000817
6        1  5.000150
7        1 13.995800
8        1 11.001933
9        1 24.987017
10       1  0.999850
11       1  2.998750

Next I plot this data like so:

  p <- ggplot(data = df, mapping = aes(x = masterX, y = masterY)) +
    geom_dotplot(inherit.aes = TRUE, binwidth = 0.005, data = df, y = masterY, show.legend=TRUE) +
    stat_summary(fun.data = mean_sdl, color = "red")

When I run this, something strange happens. The stat_summary() layer plots correctly, but the geom_dotplot() layer swaps the x values, so the points from each data set appear at the other set's x position. The graph looks like this:

[Image: geom_dotplot mixing up x values]

It occurred to me that this might be because I pass a 'y' argument to geom_dotplot() but no 'x' argument, so I tried adding 'x=masterX' to its arguments. When I do that, I get this error:

Error: stat_bindot requires the following missing aesthetics: x

Strangely, when I instead delete the 'y' argument from the call, I get the corresponding error for 'y':

Error: geom_dotplot requires the following missing aesthetics: y

I've already worked around the problem by swapping the order in the masterY/masterX definitions, like so:

masterY = append(ycol2, ycol)
masterX = append(xcol2, xcol)

But this is rather unsatisfying. I know the plot still isn't treating each row as an (x, y) pair; it is simply plotting based on the order of the rows in the data frame. I'd like to learn how to deal with intermixed data in the future. I get the feeling I'm misusing a function or doing something very non-idiomatic, but I'm not sure what.

Could anyone explain why this is happening and/or how I could use ggplot to graph data where the two groups are intermixed, like so?

   masterX   masterY
1        2 10.998817
2        2 10.999933
3        2 37.001567
4        1  2.000817
5        2 15.016150
6        1  5.000150
7        1 13.995800
8        1 11.001933
9        1 24.987017
10       1  0.999850
11       1  2.998750

Upvotes: 3

Views: 439

Answers (1)

nograpes

Reputation: 18323

I think this will get you what you want:

ggplot(df, aes(x = masterX, y = masterY)) +
  geom_point() +
  stat_summary(fun.data = mean_sdl, color = "red")
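
If you would rather keep a dot plot than switch to points, here is a minimal sketch of one way to do it. My understanding of the original behaviour (treat it as an assumption, I have not traced it through the ggplot2 source) is that `y = masterY` written outside `aes()` is not a mapping at all: it is a fixed vector picked up from your workspace and applied by row position, so it is never paired with `masterX`. In addition, `geom_dotplot()` bins along the x axis by default (`binaxis = "x"`), which is not what you want when x is a group label. The sketch maps both columns inside `aes()`, turns `masterX` into a factor, and bins along y instead. The data is the intermixed table from the question; the `binwidth` of 0.5 and the `stackdir = "center"` choice are my own picks, not something from your code.

```r
library(ggplot2)

# Intermixed data copied from the second table in the question.
df <- data.frame(
  masterX = c(2, 2, 2, 1, 2, 1, 1, 1, 1, 1, 1),
  masterY = c(10.998817, 10.999933, 37.001567, 2.000817, 15.016150,
              5.000150, 13.995800, 11.001933, 24.987017, 0.999850,
              2.998750)
)

# Both aesthetics are mapped once, in aes(), so every layer pairs x and y
# row by row regardless of how the rows are ordered.
p <- ggplot(df, aes(x = factor(masterX), y = masterY)) +
  # Bin along y and stack the dots sideways around each group position.
  # binwidth = 0.5 is an assumed value; tune it to your data's range.
  geom_dotplot(binaxis = "y", stackdir = "center", binwidth = 0.5) +
  # mean_sdl needs the Hmisc package to be installed when the plot is drawn.
  stat_summary(fun.data = mean_sdl, color = "red") +
  labs(x = "masterX")

# print(p) draws the plot.
```

Because the pairing now comes from the data frame rows rather than from vector order, reordering the rows should not change the picture.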

Upvotes: 1
