Reputation: 1036
I have an R dataframe that contains a string variable and a numerical variable, and I would like to plot the top 10 strings, based on the value of the numerical variable.
I can of course get the top 10 entries pretty simply:
top10_rank <- rank[order(rank$numerical_var_name),]
My first approach to trying to visualize this was to simple attempt to plot this like:
ggplot(data=top10_rank, aes(x = top10_rank$numerical_var_name, y = top10_rank$string_name)) + geom_point(size=3)
And to a first approximation this "works" - the problem is that the strings on the y axis are sorted alphabetically rather than by the numerical value.
My preference would be to find a way to plot the top 10 strings without having to bother showing the numerical variable at all - just basically as a list (even better would be if I could enumerate the list). I am attempting to plot this so it looks more pleasing than simply dumping the text to the screen.
Any ideas greatly appreciated!
Upvotes: 0
Views: 5019
Reputation: 380
The y-axis tick marks may be sorted alphabetically, but the points are drawn in order(from left to right) of the top10_rank dataframe. What you need to do is change the order of the y-axis. Add this to your call of ggplot + scale_y_discrete(limits=top10_rank$String)
and it should work.
ggplot(data=top10_rank, aes(x = top10_rank$Number,
y = top10_rank$String)) + geom_point(size=3) + scale_y_discrete(limits=top10_rank$String)
Here is a link to a great resource on R graphics: R Graphics Cookbook
Upvotes: 2