anthr

Reputation: 1036

Plot a 'top 10' style list/ranking in R based on numerical column of dataframe

I have an R dataframe that contains a string variable and a numerical variable, and I would like to plot the top 10 strings, based on the value of the numerical variable.

I can of course get the top 10 entries pretty simply:

top10_rank <- head(rank[order(rank$numerical_var_name, decreasing = TRUE), ], 10)

My first approach to visualizing this was simply to plot it like this:

ggplot(data = top10_rank, aes(x = numerical_var_name, y = string_name)) + geom_point(size = 3)

And to a first approximation this "works" - the problem is that the strings on the y axis are sorted alphabetically rather than by the numerical value.

My preference would be to find a way to plot the top 10 strings without having to bother showing the numerical variable at all - just basically as a list (even better would be if I could enumerate the list). I am attempting to plot this so it looks more pleasing than simply dumping the text to the screen.

Any ideas greatly appreciated!

Upvotes: 0

Views: 5019

Answers (1)

small_data88

Reputation: 380

The y-axis labels are sorted alphabetically because ggplot2 orders a discrete axis by the levels of the variable, which default to alphabetical order for strings; the row order of the top10_rank data frame is ignored. What you need to do is set the order of the y-axis explicitly. Add + scale_y_discrete(limits = top10_rank$String) to your ggplot call and it should work:

ggplot(data = top10_rank, aes(x = Number, y = String)) +
  geom_point(size = 3) +
  scale_y_discrete(limits = top10_rank$String)
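For a version you can run end to end, here is a minimal sketch. The data frame contents are made up for illustration, the String and Number column names follow the call above, and the Label column is a hypothetical helper that prefixes each string with its rank to give the enumerated list the question asks for.

```r
library(ggplot2)

# Hypothetical example data (assumption: one string column, one numeric column)
set.seed(1)
rank <- data.frame(
  String = paste("item", letters[1:15]),
  Number = round(runif(15, 0, 100)),
  stringsAsFactors = FALSE
)

# Keep the 10 rows with the largest Number, largest first
top10_rank <- head(rank[order(rank$Number, decreasing = TRUE), ], 10)

# Prefix each string with its rank to get an enumerated label
top10_rank$Label <- paste0(seq_len(nrow(top10_rank)), ". ", top10_rank$String)

# A discrete y-axis is filled from bottom to top, so reverse the
# limits to put rank 1 at the top of the plot
p <- ggplot(top10_rank, aes(x = Number, y = Label)) +
  geom_point(size = 3) +
  scale_y_discrete(limits = rev(top10_rank$Label))

print(p)
```

If you do not want the numeric axis shown, adding something like theme(axis.text.x = element_blank(), axis.title.x = element_blank()) to p should hide its text.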

A great resource on R graphics is the R Graphics Cookbook.

Upvotes: 2
