Reputation: 37
I am struggling to order the x-axis correctly for a scatter plot. I would like the discrete x-axis labels to be ordered by increasing value of a numeric variable, using only the rows that belong to one particular level of a second discrete factor. The plot should also be split with facet_grid (or facet_wrap, if that is better in this case?) by a fourth variable, which is also discrete. I hope that makes sense; if not, hopefully it will once I explain with the example below.
There seem to be a couple of useful help pages online, and I'm sure the answer is in there somewhere, but I just can't seem to apply it to my case.
Here is my example dataset...
Car = c("A","A","A","B","B","C","C","D","D","E","E","F","F","G","G","G","H","H","H","H","I","I","J","J","J","K","K","K","L","L","M","M","N","N","N","O","O","P","P","Q","Q","R","R","S","S","T","T","U","U","U","V","V","V","V","X","X","X")
Area = c("MMR","QRT","VF","QRT","VF","MMR","QRT","MMR","QRT","MMR","QRT","QRT","VF","MMR","QRT","VF","MMR","QRT","PP","VF","QRT","VF","QRT","PP","VF","MMR","QRT","VF","QRT","VF","QRT","VF","MMR","QRT","VF","QRT","VF","QRT","VF","QRT","VF","MMR","QRT","MMR","QRT","MMR","QRT","MMR","QRT","VF","MMR","QRT","PP","VF","MMR","QRT","VF")
Distance = c(100,0.0022,1320,0.002,1056,1030,0.025,62.1,0.06,80,0.011,7.2,100,671,91.677,165,0.61,0.1102,0.08,11.5,0.173,327,0.159,0.82,0.01902,10,0.0079,23,0.186,0.02235,0.038,0.022,100,0.016,0.01359,0.18,0.02291,0.00048,1000,0.007,8.21,1000,0.0349,100,0.0056,100,0.022,100,0.05,13,17.9,0.032,0.22,87,100,0.09,0.0251)
Country = c("UK","UK","UK","UK","UK","UK","UK","UK","UK","UK","UK","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","FR","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM","AM")
df=data.frame(Car, Area, Distance, Country)
df
I wish to have a plot with 'Car' on the x-axis and 'Distance' on the y-axis. I would like the plot to be split by 'Country' using facet_grid, and within each facet I'd like the x-axis to be ordered by increasing 'Distance' for the 'QRT' level of the 'Area' factor.
The following code produces the plot I am aiming for (except for the x-axis sorting issue):
Fig2B<- ggplot(df,aes(x=Car,y=Distance,colour=Area)) +
coord_trans(y = "log10") +
geom_point() +
facet_grid(. ~ Country, scales = "free", space="free")
The closest I have gotten to re-ordering this is through the following helpful post.
Using the following code I can create a new factor that appears to order it correctly.
# 1. Remove grouping
ungroup(df) %>%
# 2. Arrange by
# i. facet group
# ii. bar height
arrange(Country, Distance, Area) %>%
# 3. Add order column of row numbers
mutate(order = row_number())
However, I cannot work out how to take this to the next stage and use it in my plot with the code from the article. I get the following message:
Don't know how to automatically pick scale for object of type function. Defaulting to continuous.
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 0, 57
I'm now not sure where to go from here.
Upvotes: 1
Views: 1643
Reputation: 145775
"I can create a new factor that appears to order it correctly."
This is the right goal.
"I'd like the x-axis to be ordered by increasing distance of 'QRT' in the 'Area' factor"
Okay, so we need this ordering.
order =
## filter down to just QRT
filter(df, Area == "QRT") %>%
## get mean distance for each car (just in case there are
## multiple QRT values for a single car - more general than your example)
group_by(Car) %>%
summarize(qrtdist = mean(Distance)) %>%
## sort ascending
arrange(qrtdist) %>%
## make the Car column a character
mutate(Car = as.character(Car))
So the Car column of this new order data set should have the correct ordering. Now we apply this ordering to the original data and the plot will work as desired:
df$Car = factor(df$Car, levels = order$Car)
ggplot(df,aes(x=Car,y=Distance,colour=Area)) +
coord_trans(y = "log10") +
geom_point() +
facet_grid(. ~ Country, scales = "free", space="free")
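A quick way to confirm that the ordering took effect is to inspect the factor levels before plotting. The sketch below repeats the same steps on a small made-up data frame; the name toy, its values, and the name car_order are illustrative assumptions and not part of the question's data. The lookup table is called car_order here only to avoid masking the base function order().

```r
library(dplyr)

# Toy data (hypothetical values, not the question's dataset)
toy <- data.frame(
  Car      = c("A", "A", "B", "B", "C", "C"),
  Area     = c("QRT", "VF", "QRT", "VF", "QRT", "MMR"),
  Distance = c(0.5, 100, 0.01, 20, 3, 50)
)

# Mean QRT distance per car, sorted ascending
car_order <- toy %>%
  filter(Area == "QRT") %>%
  group_by(Car) %>%
  summarize(qrtdist = mean(Distance)) %>%
  arrange(qrtdist) %>%
  mutate(Car = as.character(Car))

# Apply the ordering as the factor levels of Car
toy$Car <- factor(toy$Car, levels = car_order$Car)

# Levels now follow ascending QRT distance: B (0.01), A (0.5), C (3)
levels(toy$Car)
```

Note that factor() turns any value missing from levels into NA, so a car with no "QRT" row would be dropped from the axis with this approach.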
base
The above was the fancy dplyr way, but we can actually simplify a lot in this case using base. There is a command reorder() for reordering levels of a factor by a function of some other variable.
In this case, we want to reorder the df$Car factor, using the values of df$Distance where df$Area is "QRT".
df$Car = reorder(
# factor to reorder
df$Car,
# vector that is Distance when Area is "QRT" and NA otherwise
ifelse(df$Area == "QRT", df$Distance, NA),
# function of that vector
FUN = mean,
# additional FUN argument: remove NA values
na.rm = TRUE
)
Without all the comments, we can do this:
df$Car = reorder(df$Car, ifelse(df$Area == "QRT", df$Distance, NA), mean, na.rm = TRUE)
ggplot(df,aes(x=Car,y=Distance,colour=Area)) +
coord_trans(y = "log10") +
geom_point() +
facet_grid(. ~ Country, scales = "free", space="free")
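To see what the ifelse()/NA trick does inside reorder(), here is a minimal sketch on a small made-up data frame. The names toy and qrt_only and all the values are illustrative assumptions, not the question's data.

```r
# Toy data (hypothetical values, not the question's dataset)
toy <- data.frame(
  Car      = c("A", "A", "B", "B", "C", "C"),
  Area     = c("QRT", "VF", "QRT", "VF", "QRT", "MMR"),
  Distance = c(0.5, 100, 0.01, 20, 3, 50)
)

# Distance where Area is "QRT", NA everywhere else
qrt_only <- ifelse(toy$Area == "QRT", toy$Distance, NA)

# Per-car mean of the non-NA values decides the level order
toy$Car <- reorder(toy$Car, qrt_only, FUN = mean, na.rm = TRUE)

# Levels now follow ascending QRT distance: B (0.01), A (0.5), C (3)
levels(toy$Car)
```

Because the non-QRT rows are NA and na.rm = TRUE drops them, only the QRT distances contribute to each car's score.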
Upvotes: 1