user12585925

Sorting values on y axis

I have tried many solutions I found on the internet, but sadly none of them worked for me. I want to sort my Y axis because, for some reason, it seems to pick at random which value to show first. Here is my code:

ggplot(Dane,aes(Dane$State, Dane$`Unsheltered persons (% Homeless population)`))+
       geom_point(aes(Dane$State, Dane$`Unsheltered persons (% Homeless population)`), color="red")+
       theme(text = element_text(size=15),axis.text.x = element_text(angle=90, hjust=1))+
       labs(title = "Wykres przedstawiajacy jaka czesc populacji nie ma schronienia",
            x="Stan", y="Brak schronienia")

As you can see, the Y axis doesn't go from the lowest to the highest % value:

[Image: plot in which the Y axis values are not ordered from lowest to highest]

Here is what the data looks like:

State `Total Homeless~ `Rate of Homele~ `Chronic indivi~ `Chronic Person~ `Chronic Homele~ `Persons in fam~ `Unaccompanied ~
   <chr>            <dbl>            <dbl> <chr>            <chr>            <chr>            <chr>            <chr>           
 1 Alab~             4689              9.7 16.4%            1.9%             18.3%            27.8%            8.4%            
 2 Alas~             1946             26.5 8.5%             0.9%             9.5%             30.0%            8.6%            
 3 Ariz~            10562             15.9 10.1%            1.2%             11.2%            38.4%            6.4%            
 4 Arka~             3812             12.9 14.8%            1.0%             15.8%            16.7%            7.6%            
 5 Cali~           136826             35.7 25.9%            2.8%             28.7%            18.3%            11.3%           
 6 Colo~             9754             18.5 13.9%            4.4%             18.2%            52.2%            5.2%            
 7 Conn~             4448             12.4 19.6%            3.9%             23.5%            30.3%            5.3%            
 8 Dela~              946             10.2 6.9%             0.6%             7.5%             39.2%            3.7%            
 9 Dist~             6865            106.  25.7%            3.8%             29.5%            46.2%            2.4%            
10 Flor~            47862             24.5 16.3%            3.9%             20.2%            34.5%            7.2%            

Upvotes: 0

Views: 237

Answers (2)

Zhiqiang Wang

Reputation: 6769

You could also try this before your ggplot:

Dane$`Unsheltered persons (% Homeless population)` <- as.numeric(unlist(strsplit(Dane$`Unsheltered persons (% Homeless population)`, "%")))
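
As a rough illustration of what this conversion does, here is a minimal base R sketch on a made-up vector. The name pct and its values are assumptions for the example, not the real Dane column, and sub() is used here in place of strsplit() only because it keeps one output element per input element.

```r
# Made-up percent strings in the same "12.3%" format as the question's columns
pct <- c("16.4%", "1.9%", "25.9%", NA)

# Drop the literal "%" sign, then parse what is left as a number
pct_num <- as.numeric(sub("%", "", pct, fixed = TRUE))

is.numeric(pct_num)  # TRUE
sort(pct_num)        # 1.9 16.4 25.9 (sort() drops the NA by default)
```

Once the column is numeric, ggplot2 should treat the Y axis as continuous and draw it in numeric order.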

Upvotes: 1

Greg Snow

Reputation: 49670

The variables that you are passing to ggplot (more specifically geom_point) look to be character vectors. Internally R is converting the character strings into factors before plotting them and the default order of the levels of factors is lexical (the order that you are seeing in the plots).
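
A small sketch of that lexical ordering, using a made-up vector (pct is an assumed name, not a column from Dane):

```r
# Percent values stored as character strings (illustrative values only)
pct <- c("8.4%", "11.3%", "2.4%")

# factor() sorts the unique strings character by character,
# so "11.3%" comes before "2.4%" because "1" sorts before "2"
levels(factor(pct))  # "11.3%" "2.4%" "8.4%"
```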

There is some variety in how different programs deal with ordering in plots. Older programs (from before rich data structures) would consider the ordering to be a property of the plot, so you would specify any ordering as an option to the plot. R has richer data structures and sees ordering as a property of the data rather than the plot (you can specify it once and have it be consistent in all plots, tables, etc. instead of having to repeat the ordering over and over). This means that the best way to get the ordering you want is to modify your data (data frame or tibble) to have the variable(s) of interest be factors with the ordering that you want, then call ggplot on the modified data.

There are a few ways to do this. Since you are using ggplot2, you probably will not mind using other tidyverse packages. A simple approach is to use the str_sort function from the stringr package:

library(stringr)
Dane$`Unsheltered persons (% Homeless population)` <- factor(Dane$`Unsheltered persons (% Homeless population)`, 
levels=str_sort(unique(Dane$`Unsheltered persons (% Homeless population)`), numeric=TRUE))

There are other ways, such as relevel from base R or mutate from the dplyr package, among others.
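
One such alternative can be sketched in base R without extra packages: parse the number out of each label and use it only to order the factor levels. The vector pct and its values are made up for illustration.

```r
# Percent labels stored as character strings (illustrative values only)
pct <- c("8.4%", "11.3%", "2.4%", "8.4%")

# Numeric value behind each label, used only to decide the ordering
num <- as.numeric(sub("%", "", pct, fixed = TRUE))

# Unique labels arranged by their numeric value
lv <- unique(pct[order(num)])

# Factor that keeps the original labels but with numerically ordered levels
f <- factor(pct, levels = lv)
levels(f)  # "2.4%" "8.4%" "11.3%"
```

This keeps the "%" labels on the axis while still ordering them by value.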

Note that it is better to use data=Dane in the call to ggplot and refer to the bare column names inside aes, rather than specifying Dane$ before each variable.
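
A minimal sketch of that calling style, assuming the ggplot2 package is installed. Dane_demo is a made-up stand-in for Dane with placeholder state labels and invented numbers, and the percent column is assumed to be numeric already.

```r
library(ggplot2)

# Hypothetical stand-in for Dane; labels and values are invented for the example
Dane_demo <- data.frame(
  State = c("A", "B", "C"),
  `Unsheltered persons (% Homeless population)` = c(10.5, 30.2, 20.1),
  check.names = FALSE  # keep the column name with spaces and symbols as is
)

# Pass the data frame once, then use bare (backtick-quoted) column names in aes()
p <- ggplot(data = Dane_demo,
            aes(State, `Unsheltered persons (% Homeless population)`)) +
  geom_point(color = "red")
```

Because the mapping is set once in ggplot(), geom_point() inherits it and does not need its own aes() call.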

Upvotes: 1
