Khashir

Reputation: 351

How to plot a “pivot table” with 3 variables, using the number of observations as the size of the point?

I've been trying to get this through a couple of routes, but no luck.

I have two columns, "Type" (strings) and "Units" (integers). I want the x-axis to be Type, the y-axis to be the sum of Units per Type, and the size of each point to reflect the number of observations for that Type (gapminder style).

Here's what I've tried so far:

  1. plot(xtabs(Units ~ Type)): This gets the right x/y relationship, but I don't see how to make the point size represent the number of observations.
  2. ggplot(data, aes(Type, Units)) + geom_count(): Half a step back, since I've lost the sum, but at least now I can see which Type has more observations.

In a way, I want a static gapminder plot.

Ideally, I would like to do it with ggplot2 (or anything but the default), since the graphs are nicer; but I'll take any pointers.

I feel quite close, but I'm not sure what I'm missing.

Update:

structure(list(CONTRIBUTOR_NAME = c("DATA LTD", 
"GEORGE", "PHIL", "JOHN J LOVE", 
"E LOVE", "MADISON LTD", "NAOMI", 
" HARRISON", " GILL", "RON R ", "MARIBETH ", 
"BEV", "P ANN", "DYCK", "GEHRING", 
"HARVIE ", "SUSAN", "JANE", "RANDY C", 
"GEORGE N "), DATE = structure(c(13511, 15826, 14461, 
16491, 12874, 15295, 15881, 13466, 14373, 16560, 15223, 13518, 
14096, 16555, 15861, 15644, 15923, 13153, 15741, 15091), class = "Date"), 
    AMOUNT = c(500, 200, 12.5, 30, 300, 332, 10, 10, 100, 23, 
    20, 22, 25, 10, 25, 50, 10, 40, 35, 150), CLASS = c("Corporations", 
    "Individuals", "Individuals", "Individuals", "Individuals", 
    "Corporations", "Individuals", "Individuals", "Individuals", 
    "Individuals", "Individuals", "Individuals", "Individuals", 
    "Individuals", "Individuals", "Individuals", "Individuals", 
    "Individuals", "Individuals", "Individuals"), PARTY = c("PARTY 5", 
    "PARTY 1", "PARTY 1", "PARTY 1", "PARTY 1", "PARTY 5", 
    "PARTY 1", "PARTY 5", "PARTY 5", "PARTY 1", 
    "PARTY 1", "PARTY 1", "PARTY 1", "PARTY 5", "PARTY 1", 
    "PARTY 5", "PARTY 1", "PARTY 1", "PARTY 1", "PARTY 1"
    ), BUSINESS = c("Businesses", "Individuals", "Individuals", 
    "Individuals", "Individuals", "Businesses", "Individuals", 
    "Individuals", "Individuals", "Individuals", "Individuals", 
    "Individuals", "Individuals", "Individuals", "Individuals", 
    "Individuals", "Individuals", "Individuals", "Individuals", 
    "Individuals")), .Names = c("CONTRIBUTOR_NAME", "DATE", "AMOUNT", 
"CLASS", "PARTY", "BUSINESS"), row.names = c(NA, -20L), class = c("tbl_df", 
"tbl", "data.frame"))

Upvotes: 0

Views: 488

Answers (1)

G5W

Reputation: 37641

You don't provide your data, so I can't illustrate with that. I will use the built-in mtcars data instead. You can get the number of observations of each type using table() and then set the point size with the cex parameter.

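# sort() so the x values follow the same order as xtabs() and table() (4, 6, 8);
# unique() alone returns them in order of appearance (6, 4, 8)
plot(sort(unique(mtcars$cyl)), as.vector(xtabs(mpg ~ cyl, data=mtcars)), 
    xlim=c(3.5,8.5), ylim=c(130,310), pch=20,
    xlab="Cylinders", ylab="Sum of MPG", 
    cex = as.vector(table(mtcars$cyl)),   # point size = number of cars per group
    col="#0000FF44")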

Gapminder style
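
Since the question asks for ggplot2, here is a minimal sketch of the same idea there: summarise first (sum and count per group), then map the count to point size. The data frame below is made up to stand in for the "Type" and "Units" columns from the question, and the names summary_df, total and n are just illustrative, so adjust them to your own data.

```r
library(ggplot2)

# Toy data standing in for the question's columns (values are invented)
data <- data.frame(
  Type  = c("A", "A", "A", "B", "B", "C"),
  Units = c(10, 20, 5, 40, 15, 7)
)

# One row per Type: sum of Units and number of observations.
# Both aggregate() calls return groups in the same (sorted) order.
totals <- aggregate(Units ~ Type, data = data, FUN = sum)
counts <- aggregate(Units ~ Type, data = data, FUN = length)
summary_df <- data.frame(Type  = totals$Type,
                         total = totals$Units,
                         n     = counts$Units)

# x = Type, y = sum of Units, point size = number of observations
p <- ggplot(summary_df, aes(x = Type, y = total, size = n)) +
  geom_point(colour = "blue", alpha = 0.4) +
  scale_size_area(max_size = 15) +
  labs(x = "Type", y = "Sum of Units", size = "Observations")
print(p)
```

scale_size_area() makes the point area (rather than the radius) proportional to the count, which is usually what you want for a bubble chart. With the data from the update, the same pattern should work with CLASS or PARTY in place of Type and AMOUNT in place of Units.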

Upvotes: 1
