g.humpkins

Reputation: 341

ggplot2 is incorrectly sorting the X axis of my graph

I am trying to create a plot of user login behavior for a two-month period. I used the qplot function from the ggplot2 package with the following code:

qplot(date_time, login_count, data=client_login_clean)

I plotted login_count over time as shown below. Unfortunately, the y-axis, num_records, is sorted such that the first five tick marks are 1, 10, 106, 11, and 12, rather than 1, 2, 3, 4, 5. Could someone let me know how to fix this?

User logins over time

Upvotes: 0

Views: 585

Answers (2)

Ari B. Friedman

Reputation: 72779

You almost certainly have your variable encoded as a factor rather than an integer. Try client_login_clean$login_count <- as.numeric(as.character(client_login_clean$login_count)) then run it again. An alternative is taRifx::destring.
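To see why the as.character() step matters, here is a minimal base R sketch. The login_count vector below is a made-up stand-in for the asker's column, not the real data.

```r
# Hypothetical stand-in for the login_count column, stored as a factor.
login_count <- factor(c("1", "10", "106", "11", "12", "2"))

# Factor levels are sorted as strings; this is the order the axis shows.
print(levels(login_count))

# as.numeric() on a factor returns the internal level codes, not the values.
codes <- as.numeric(login_count)

# Converting to character first recovers the numbers the labels represent.
values <- as.numeric(as.character(login_count))

print(codes)
print(sort(values))
```

Calling as.numeric() directly on a factor is a common trap, which is why the conversion goes through as.character() first.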

As a side note, a wizard never mis-sorts his axis. He sorts it precisely as he means to.

Upvotes: 2

shadowtalker

Reputation: 13913

That's because, for some reason, your login_count variable is a character vector. ggplot internally coerces all character vectors to factors, with labels ordered alphabetically, and then sorts the axis according to that order.

I also think I know why this happened: "num_records" is actually a value in your login_count column, so the whole thing has been coerced to a character vector. Delete that element and use as.numeric, then the ordering should be correct. This is a good opportunity to read over your data loading/generating process and make sure you haven't made any other mistakes. Sometimes the tiniest bugs can uncover massive problems you would never have noticed otherwise.
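A rough sketch of that cleanup in base R, assuming the stray value really is the string "num_records" sitting in the column. The data frame here is invented for illustration; only the column and data frame names come from the question.

```r
# Hypothetical data: a stray header value mixed into the counts.
client_login_clean <- data.frame(
  login_count = c("1", "10", "num_records", "2", "106"),
  stringsAsFactors = FALSE
)

# Flag entries that are not made up entirely of digits.
bad <- !grepl("^[0-9]+$", client_login_clean$login_count)
print(client_login_clean$login_count[bad])

# Drop those rows, then convert the remaining values to numbers.
client_login_clean <- client_login_clean[!bad, , drop = FALSE]
client_login_clean$login_count <- as.numeric(client_login_clean$login_count)

print(sort(client_login_clean$login_count))
```

Printing the flagged entries before dropping them is worthwhile, since they often point at the step in the loading process that went wrong.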

As a side note, this is why you should be careful with character-class variables and ggplot plotting. You can save yourself a lot of headaches by explicitly specifying a factor ordering up front.
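For a variable that really is categorical, specifying the ordering up front might look like the sketch below. The weekday values are an invented example, not something from the question.

```r
# Hypothetical categorical variable with a natural, non-alphabetical order.
day <- c("Tue", "Mon", "Thu", "Wed", "Mon")

# Default factor(): levels are sorted alphabetically.
default_levels <- levels(factor(day))

# Explicit levels: the order is whatever you specify.
day_ordered <- factor(day, levels = c("Mon", "Tue", "Wed", "Thu"))
explicit_levels <- levels(day_ordered)

print(default_levels)
print(explicit_levels)
```

A discrete axis in ggplot2 follows the factor's level order, so setting levels this way controls the order of the tick labels.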

Upvotes: 3
